
COPY OF PAPERS
ORIGINALLY FILED



Attorney's Docket No.: 13425-042001 / 00298-US 




IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicant  : Sven Enerback et al.
Serial No. : 09/963,285
Filed      : September 26, 2001
Title      : PROMOTER SEQUENCES
Art Unit   : 1645
Examiner   :



Commissioner for Patents 
Washington, D.C. 20231 







TRANSMITTAL OF CERTIFIED PRIORITY DOCUMENT UNDER 35 USC §119 

In accordance with the provisions of 35 U.S.C. §119, applicants hereby claim priority of 
Swedish Patent application No. 0003435-5, filed September 26, 2000. A certified copy of the 
application is submitted herewith. As the priority application is in the English language, all of 
the requirements of § 119 have been met.

Please apply any charges or credits to Deposit Account No. 06-1050, re 
Attorney Docket No. 13425-042001. 

Respectfully submitted, 



Date: [illegible]



Fish & Richardson P.C. 

225 Franklin Street 

Boston, Massachusetts 02110-2804

Telephone: (617)542-5070 

Facsimile: (617)542-8906 




Jack Brennan
Reg. No. 47,443 



20331720.doc 



CERTIFICATE OF MAILING BY FIRST CLASS MAIL 

I hereby certify under 37 CFR § 1.8(a) that this correspondence is being 
deposited with the United States Postal Service as first class mail with 
sufficient postage on the date indicated below and is addressed to the 
Commissioner for Patents, Washington, D.C. 20231.



[illegible]



Date of Deposit 



Signature 

[illegible]

Typed or Printed Name of Person Signing Certificate 



PRV 



JAN 10 2002



PATENT- OCH REGISTRERINGSVERKET
Patent Department

Certificate

This is to certify that the annexed is a true copy of
the documents as originally filed with the Patent- and
Registration Office in connection with the following
patent application.

(71) Applicant(s): Pharmacia & Upjohn AB, Stockholm SE

(21) Patent application number: 0003435-5

(86) Date of filing: 2000-09-26




Stockholm, 2001-09-19



For the Patent- and Registration Office

Kerstin Garden

Fee 110:-






PATENT- OCH REGISTRERINGSVERKET
Address: Box 5055, S-102 42 STOCKHOLM, Sweden
Phone: +46 8 782 25 00 (Vx 08-782 25 00)
Telex: 17978 PATOREG S
Telefax: +46 8 666 02 86 (08-666 02 86)





PROMOTER SEQUENCES

TECHNICAL FIELD

The present invention relates to an isolated promoter region of the mammalian transcription
factor FOXC2. The invention also relates to screening methods for agents modulating the
expression of FOXC2 and thereby being potentially useful for the treatment of medical
conditions related to obesity. The invention further relates to a previously unknown variant
of the human FOXC2 gene, derived via the use of an alternative promoter, which produces
an additional exon that generates a distinct open reading frame via splicing. The alternative
gene encodes a variant of the FOXC2 transcription factor, which is lacking a part of the
DNA-binding domain and consequently has a potential regulatory function.

BACKGROUND ART

More than half of the men and women in the United States, 30 years of age and older, are
now considered overweight, and nearly one-quarter are clinically obese. This high 
prevalence has led to increases in the medical conditions that often accompany obesity, 
especially non-insulin dependent diabetes mellitus (NIDDM), hypertension, cardiovascular 
disorders, and certain cancers. Obesity results from a chronic imbalance between energy 
intake (feeding) and energy expenditure. To better understand the mechanisms that lead to 
obesity and to develop strategies in certain patient populations to control obesity, there is a 
need to develop a better underlying knowledge of the molecular events that regulate the 
differentiation of preadipocytes and stem cells to adipocytes, the major component of 
adipose tissue. 

The helix-loop-helix (HLH) family of transcriptional regulatory proteins are key players in 
a wide array of developmental processes (for a review, see Massari & Murre (2000) Mol. 
Cell. Biol. 20: 429-440). Over 240 HLH proteins have been identified to date in organisms 
ranging from the yeast Saccharomyces cerevisiae to humans. Studies in Xenopus laevis, 
Drosophila melanogaster, and mice have convincingly demonstrated that HLH proteins are 
intimately involved in developmental events such as cellular differentiation, lineage
commitment, and sex determination. In multicellular organisms, HLH factors are required
for a multitude of important developmental processes, including neurogenesis, myogenesis, 
hematopoiesis, and pancreatic development. 

The winged helix / forkhead class of transcription factors is characterized by a 100-amino
acid, monomeric DNA-binding domain. X-ray crystallography of the forkhead domain
from HNF-3γ has revealed a three-dimensional structure, the "winged helix", in which two
loops (wings) are connected on the C-terminal side of the helix-loop-helix (for reviews, see
Brennan, R.G. (1993) Cell 74: 773-776; and Lai, E. et al. (1993) Proc. Natl. Acad. Sci.
U.S.A. 90: 10421-10423).

The isolation of the mouse mesenchyme forkhead-1 (MFH-1) and the corresponding
human (FKHL14) chromosomal genes is disclosed by Miura, N. et al. (1993) FEBS Letters
326: 171-176; and (1997) Genomics 41: 489-492. The nucleotide sequences of the mouse
MFH-1 gene and the human FKHL14 gene have been deposited with the EMBL/GenBank
Data Libraries under accession Nos. Y08222 (SEQ ID NO: 5) and Y08223 (SEQ ID NO:
8), respectively. A corresponding gene has been identified in Gallus gallus (GenBank
accession numbers U37273 and U95823).

The International Patent Application WO 98/54216 discloses a gene encoding a Forkhead-
Related Activator (FREAC)-11 (also known as S12), which is identical with the
polypeptide encoded by the human FKHL14 gene disclosed by Miura, supra. This
transcription factor is expressed in adipose tissue and involved in lipid metabolism and
adipocyte differentiation (cf. Swedish patent application No. 0000531-4, filed February 18,
2000).

The nomenclature for the winged helix / forkhead transcription factors has been
standardized and Fox (Forkhead Box) has been adopted as the unified symbol (Kaestner et
al. (2000) Genes & Development 14: 142-146; see also http://www.biology.pomona.edu/fox).
It has been agreed that the genes previously designated MFH-1 and FKHL14 (as well
as FREAC-11 and S12) should be designated FOXC2.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the general structure of the human FOXC2 gene. 

Figure 2 illustrates the results from phylogenetic footprinting experiments. Shown is the
fraction conserved (1.0 = 100%) between mouse FoxC2 and human FOXC2 sequences in
the alignment generated with Clustal. Solid (bold) line indicates the fraction of the human
sequence which is identical to the mouse within a 200 bp "window" over the human
sequence in the alignment. The weak (dotted) line is set to -0.05 when the sliding window
contains human exon sequence and to -0.1 when the window is entirely composed of exon
sequence. Regions containing local maxima or exceeding a conservation fraction of 0.7 are
likely to be functional and are classified as "predicted regulatory regions".

Figure 3 illustrates the predicted "enhancer" region in the human FOXC2 gene. Underlined
sequences indicate likely transcription factor binding sites. Boxed sequence indicates exon
sequence.

Splice = sequence predicted as splice site in the alternatively spliced gene;

E-box-like = sequence resembling the "E-box" motif CANNTG known as a target for DNA
binding proteins containing a helix-loop-helix domain (often associated with the activation
of cell-type specific gene transcription during tissue differentiation; see Massari & Murre
(2000) Mol. Cell. Biol. 20: 429-440);

Forkhead-like = sequence resembling binding site for the winged helix / forkhead class of 
transcription factors; 

Ets-like = sequence resembling consensus binding site for ETS-domain transcription factor
family (see Sharrocks et al. (1997) Int. J. Biochem. Cell Biol. 29: 1371-1387).
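
The "E-box-like" annotation can be illustrated with a small motif scan. The following is only a sketch, assuming the sequence is available as a plain DNA string; the function name find_ebox_like and the example sequence are hypothetical and are not taken from the Sequence Listing.

```python
import re

# CANNTG, where N is any base; the lookahead also reports overlapping matches.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_ebox_like(seq):
    """Return (1-based start, matched hexamer) for every CANNTG occurrence."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1)) for m in EBOX.finditer(seq)]

example = "TTCAGCTGAACACGTGTT"  # hypothetical sequence, for illustration only
hits = find_ebox_like(example)
```

A scan of this kind only lists candidate sites by pattern; whether a site is functional would still have to be established experimentally.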

Figure 4 illustrates the predicted "promoter" region in the human FOXC2 gene. Underlined 
sequence indicates exon sequences. Boxed sequences indicate conserved block (potential 
transcription factor binding sites). 






DESCRIPTION OF THE INVENTION 



According to the present invention, the partially known sequence (SEQ ID NO: 8) of
the human FOXC2 gene has been extended. In the previously unknown region of the gene,
differentially conserved regions, consistent with regulatory function, have been identified. 
Further, an alternative transcript has been identified, which includes the use of at least two 
exons. The putative regulatory enhancer is immediately adjacent to the newly discovered 
alternative exon, suggesting that it may play a role in the alternative selection of transcript 
classes. 

Modulation of the FOXC2 regulation is expected to have therapeutic value in type II
diabetes, obesity, hypercholesterolemia, and other cardiovascular diseases or
dyslipidemias.

Consequently, in a first aspect this invention provides a human FOXC2 promoter region 
comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 1250 to 2235, such as positions
1250 to 1749 or positions 1692 to 1703, in SEQ ID NO: 1, or a fragment thereof
exhibiting FOXC2 promoter activity;

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 

Another aspect of the invention is a recombinant construct comprising the human FOXC2 
promoter region as defined above. In the said recombinant construct, the human FOXC2 
promoter region can be operably linked to a gene encoding a detectable product, such as 
the human FOXC2 gene, or a reporter gene. The term "operably linked" as used herein 
means functionally fusing a promoter with a structural gene in the proper frame to express 
the structural gene under control of the promoter. As used herein, the term "reporter gene" 
means a gene encoding a gene product that can be identified using simple, inexpensive 
methods or reagents and that can be operably linked to the human FOXC2 promoter region 
or an active fragment thereof. Reporter genes such as, for example, a luciferase,
β-galactosidase, alkaline phosphatase, or green fluorescent protein reporter gene, can be used
to determine transcriptional activity in screening assays according to the invention (see, for
example, Goeddel (ed.), Methods Enzymol., Vol. 185, San Diego: Academic Press, Inc.
(1990); see also Sambrook, supra).

The invention also provides a vector comprising the recombinant construct as defined 
above, as well as a host cell stably transformed with such a vector, or generally with the 
recombinant construct according to the invention. The term "vector" refers to any carrier of 
exogenous DNA that is useful for transferring the DNA to a host cell for replication and/or 
appropriate expression of the exogenous DNA by the host cell. 

In another aspect, the invention provides a method for identification of an agent regulating 
FOXC2 promoter activity, said method comprising the steps: (i) contacting a candidate 
agent with a human FOXC2 promoter region as defined above; and (ii) determining 
whether said candidate agent modulates expression of the FOXC2 gene, such modulation 
being indicative for an agent capable of regulating FOXC2 promoter activity. As used 
herein, the term "agent" means a biological or chemical compound such as a simple or 
complex organic molecule, a peptide, a protein or an oligonucleotide. 

A transfection assay can be a particularly useful screening assay for identifying an 
effective agent. In a transfection assay, a nucleic acid containing a gene such as a reporter 
gene that is operably linked to a human FOXC2 promoter, or an active fragment thereof, is 
transfected into the desired cell type. A test level of reporter gene expression is assayed in 
the presence of a candidate agent and compared to a control level of expression. An 
effective agent is identified as an agent that results in a test level of expression that is 
different than a control level of reporter gene expression, which is the level of expression 
determined in the absence of the agent. Methods for transfecting cells and a variety of 
convenient reporter genes are well known in the art (see, for example, Goeddel (ed.),
Methods Enzymol., Vol. 185, San Diego: Academic Press, Inc. (1990); see also Sambrook,
supra). Consequently, the said method could, e.g., comprise assaying reporter gene
expression in a host cell, stably transformed with a recombinant construct comprising the
human FOXC2 promoter, in the presence and absence of a candidate agent, wherein an
effect on the test level of expression as compared to control level of expression is
indicative of an agent capable of regulating FOXC2 promoter activity.

In a further aspect, the invention provides a human FOXC2 enhancer region
comprising a sequence selected from:

(a) the nucleotide sequence set forth as positions 216 to 475, such as positions 223 to
231, positions 359 to 375, positions 378 to 402, or positions 403 to 423, in SEQ ID
NO: 1, or a fragment thereof exhibiting FOXC2 enhancer activity;

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 


As described above for the human FOXC2 promoter region, the invention further provides 
a recombinant construct comprising a human FOXC2 enhancer region, a vector comprising 
the said recombinant construct, as well as a host cell stably transformed with said vector or 
with said recombinant construct. 


Further, the invention provides a method for identification of an agent regulating FOXC2
enhancer activity, said method comprising the steps: (i) contacting a candidate agent with
the human FOXC2 enhancer region as defined above; and (ii) determining whether said
candidate agent modulates expression of the FOXC2 gene, such modulation being
indicative for an agent capable of regulating FOXC2 enhancer activity. It will be
understood by the skilled person that known steps are available for performing such a
method. For instance, a "panel" of constructs which include a variety of mutations and
deletions can be used in order to associate a response with a specific alteration of a single
base or subsegment of the regulatory apparatus. A simple panel might include: enhancer
plus promoter, promoter only, and enhancer plus a "minimal" promoter from a distinct gene.
As mentioned above, a transfection assay, using a host cell stably transformed with a
suitable recombinant construct, can be a particularly useful screening assay for identifying
an effective agent.
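
The comparison of a test level with a control level of reporter gene expression in such a transfection assay can be sketched as follows. This is a minimal illustration only: the function names, the replicate readings and the two-fold cutoff are assumptions made for the example and are not specified by the invention.

```python
from statistics import mean

def fold_change(test_levels, control_levels):
    """Mean reporter signal with the candidate agent divided by the mean signal without it."""
    control = mean(control_levels)
    if control <= 0:
        raise ValueError("control level must be positive")
    return mean(test_levels) / control

def classify_agent(test_levels, control_levels, threshold=2.0):
    """Label an agent using an illustrative fold-change cutoff (the cutoff is an assumption)."""
    fc = fold_change(test_levels, control_levels)
    if fc >= threshold:
        return "activator"
    if fc <= 1.0 / threshold:
        return "repressor"
    return "no effect"

# Hypothetical luciferase readings (arbitrary units) from replicate wells.
control = [100, 110, 90]
result = classify_agent([300, 340, 320], control)
```

The same comparison could be repeated for each construct of a panel (enhancer plus promoter, promoter only, and so on) to see which regulatory segment a response depends on.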

In yet a further aspect, the invention provides a method for identification of an agent
capable of regulating a mammalian FOXC2 promoter activity, said method comprising the
steps (i) contacting a candidate agent with a murine FoxC2 promoter nucleotide sequence
shown as positions 216 to 2235, such as positions 216 to 475 or positions 1250 to 2235, in
SEQ ID NO: 5; and (ii) determining whether said candidate agent modulates expression of
a mammalian FOXC2 gene, such modulation being indicative for an agent capable of
regulating mammalian FOXC2 promoter activity.

In another important aspect, the invention provides an isolated nucleic acid molecule
selected from:

(a) nucleic acid molecules comprising a nucleotide sequence as shown in SEQ ID NO: 3;

(b) nucleic acid molecules comprising a nucleotide sequence capable of hybridizing, under
stringent hybridization conditions, to a nucleotide sequence complementary to the
polypeptide coding region of a nucleic acid molecule as defined in (a) and which codes for
a variant form of the FOXC2 transcription factor; and

(c) nucleic acid molecules comprising a nucleic acid sequence which is degenerate as a
result of the genetic code to a nucleotide sequence as defined in (a) or (b) and which codes
for a variant form of the FOXC2 transcription factor.


In a preferred form of the invention, the said nucleic acid molecule has a nucleotide
sequence identical with SEQ ID NO: 3 of the Sequence Listing. However, the nucleic acid
molecule according to the invention is not to be limited strictly to the sequence shown as
SEQ ID NO: 3. Rather the invention encompasses nucleic acid molecules carrying
modifications like substitutions, small deletions, insertions or inversions, which
nevertheless encode proteins having substantially the biochemical activity of the FOXC2
polypeptide according to the invention. Included in the invention are consequently nucleic
acid molecules, the nucleotide sequence of which is at least 90% homologous, preferably
at least 95% homologous, with the nucleotide sequence shown as SEQ ID NO: 3 in the
Sequence Listing.

Included in the invention is also a nucleic acid molecule whose nucleotide sequence is
degenerate, because of the genetic code, to the nucleotide sequence shown as SEQ ID NO:
3. A sequential grouping of three nucleotides, a "codon", codes for one amino acid. Since
there are 64 possible codons, but only 20 natural amino acids, most amino acids are coded
for by more than one codon. This natural "degeneracy", or "redundancy", of the genetic
code is well known in the art. It will thus be appreciated that the nucleotide sequence
shown in the Sequence Listing is only an example within a large but definite group of
sequences which will encode the variant FOXC2 polypeptide.
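
The size of such a "large but definite group" can be made concrete by counting synonymous codons. The sketch below is an illustration only; it assumes the standard genetic code, and the function name and the three-residue example peptide are hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code; codons run TTT, TTC, TTA, TTG, TCT, ... ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons that encode each amino acid (stop codons excluded).
CODONS_PER_AA = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

def count_synonymous_sequences(peptide):
    """Number of distinct coding sequences (stop codon excluded) for a peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)

n_for_example = count_synonymous_sequences("MLS")  # hypothetical tripeptide
```

Because the count is a product over all residues, it grows very quickly with the length of the polypeptide while remaining finite.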

The invention includes an isolated polypeptide encoded by the nucleic acid as defined
above. In a preferred form, the said polypeptide has an amino acid sequence according to
SEQ ID NO: 4 of the Sequence Listing. However, the polypeptide according to the
invention is not to be limited strictly to a polypeptide with an amino acid sequence
identical with SEQ ID NO: 4 in the Sequence Listing. Rather the invention encompasses
polypeptides carrying modifications like substitutions, small deletions, insertions or
inversions, which polypeptides nevertheless have substantially the biological activities of
the variant FOXC2 polypeptide.

A further aspect of the invention is a vector harboring the nucleic acid molecule according
to the invention. The said vector can e.g. be a replicable expression vector, which carries
and is capable of mediating the expression of a DNA molecule according to the invention.
In the present context the term "replicable" means that the vector is able to replicate in a
given type of host cell into which it has been introduced. Examples of vectors are viruses
such as bacteriophages, cosmids, plasmids and other recombination vectors. Nucleic acid
molecules are inserted into vector genomes by methods well known in the art.


Included in the invention is also a cultured host cell harboring a vector according to the
invention. Such a host cell can be a prokaryotic cell, a unicellular eukaryotic cell or a cell
derived from a multicellular organism. The host cell can thus e.g. be a bacterial cell such as
an E. coli cell; a cell from yeast such as Saccharomyces cerevisiae or Pichia pastoris, or a
mammalian cell. The methods employed to effect introduction of the vector into the host
cell are standard methods well known to a person familiar with recombinant DNA
methods.

In yet another aspect, the invention includes a method for identifying an agent capable of
regulating expression of the nucleic acid molecule as defined above, said method
comprising the steps (i) contacting a candidate agent with the said nucleic acid molecule;
and (ii) determining whether said candidate agent modulates expression of the said nucleic
acid molecule.

In another aspect the invention provides an antisense oligonucleotide having a sequence
capable of specifically hybridizing to RNA transcribed by the alternatively spliced nucleic
acid molecule shown as SEQ ID NO: 3, so as to prevent translation of the said RNA.
Antisense nucleic acids (preferably 10 to 20 base-pair oligonucleotides) capable of
specifically binding to control sequences for the alternatively spliced FOXC2 gene are
introduced into cells, e.g. by a viral vector or colloidal dispersion system such as a
liposome. The antisense nucleic acid binds to the target nucleotide sequence in the cell and
prevents transcription and/or translation of the target sequence. Phosphorothioate and
methylphosphonate antisense oligonucleotides are specifically contemplated for
therapeutic use by the invention. Suppression of expression of the alternatively spliced
FOXC2 gene, at either the transcriptional or translational level, is useful to generate
cellular or animal models for diseases/conditions related to lipid metabolism.

Throughout this description the terms "standard protocols" and "standard procedures", 
when used in the context of molecular biology techniques, are to be understood as 
protocols and procedures found in an ordinary laboratory manual such as: Current 
Protocols in Molecular Biology, editors F. Ausubel et al., John Wiley and Sons, Inc. 1994, 
or Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A laboratory manual,
2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY 1989.

EXAMPLES 

Additional features of the invention will be apparent from the following Examples. 
Examples 1 to 4 are actual, while the remaining Examples are prophetic. 

EXAMPLE 1: Computational identification of FOXC2 genomic sequences

The sequences present in the GenBank database (http://www.ncbi.nlm.nih.gov) were
screened for sequence similarity to the human FOXC2 cDNA sequence (GenBank
accession number NM_005251 (SEQ ID NO: 9)). The BLAST algorithm (Altschul et al.
(1997) Nucleic Acids Res. 25: 3389-3402) was used for determining sequence identity.
Software for performing BLAST analyses is publicly available through the National Center
for Biotechnology Information (http://www.ncbi.nlm.nih.gov). A working draft genomic
sequence in 25 unordered pieces, from the Homo sapiens chromosome 16 clone RP11-
46309 (GenBank accession number AC009108; Version 6; GI:7689930; released 4 May
2000), was selected for further studies.

Regions in sequence AC009108 matching portions of the FOXC2 cDNA sequence
NM_005251 were combined using the PHRAP software, developed at the University of
Washington (http://www.genome.washington.edu/UWGC/...). Two
contigs of 9780 bp (positions 116445 to 126224 in GenBank AC009108.6) and 3784 bp
(positions 42927 to 46710 in GenBank AC009108.6), respectively, were assembled to
generate a human FOXC2 genomic fragment of 13451 bp.

The ClustalW multiple sequence alignment program, version 1.8 (Thompson et al. (1994)
Nucleic Acids Research 22: 4673-4680), was then used to identify the human FOXC2
extended genomic DNA sequence of 6458 bp (SEQ ID NO: 1) by comparison with the
mouse cDNA sequence X74040 (SEQ ID NO: 6). First, a 6459 bp sequence, corresponding
to positions 1500-7958 in the 13451 bp sequence, was selected. Positions 1-2285 in this
6459 bp sequence corresponded to 44426-46710 in AC009108.6, while positions 2151-
6459 corresponded to positions 126224-121916 (reverse complement taken) in
AC009108.6. The overlap of positions 2151-2285 allowed for the contigs to be joined by
the assembly program. The G residue in position 2655 was considered to be a sequencing
error and was removed, which resulted in the 6458 bp sequence set forth as SEQ ID NO: 1.
The open reading frame in SEQ ID NO: 1 encodes a polypeptide (SEQ ID NO: 2) identical
with the known human FOXC2 polypeptide shown as SEQ ID NO: 10.
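
The coordinate correspondences stated in this Example can be checked with simple interval arithmetic. The sketch below assumes that all positions are 1-based and inclusive; the helper names are hypothetical and the code does not reproduce the PHRAP assembly itself.

```python
def span_length(start, end):
    """Length of an inclusive 1-based interval, given in either orientation."""
    return abs(end - start) + 1

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Reverse complement of a DNA string (as needed for the second contig)."""
    return seq.translate(COMPLEMENT)[::-1]

# (interval in the 6459 bp sequence, interval in AC009108.6), as stated in Example 1.
MAPPINGS = [
    ((1, 2285), (44426, 46710)),
    ((2151, 6459), (126224, 121916)),  # reverse complement taken
]

# Each pair of intervals should describe the same number of bases.
lengths_agree = [span_length(*a) == span_length(*b) for a, b in MAPPINGS]

# Number of bases shared by the two mapped segments (positions 2151-2285).
overlap = span_length(2151, 2285)
```

A check of this kind is a quick way to catch transcription errors in coordinate tables before sequences are extracted.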

EXAMPLE 2: Identification of potential regulatory sequences in the human and mouse
FOXC2 genomic sequences

In phylogenetic footprinting (for a review, see Duret & Bucher (1997) Current Opinion in
Structural Biology 7(3): 399-406) sequences are aligned and a regional sequence identity is
determined for each window of a fixed, arbitrary length. This allows the identification of
potential regulatory regions in genomic sequences. Non-exon sequences that are conserved
over the course of evolution are likely to perform regulatory roles. Phylogenetic
footprinting was performed as described in Wasserman & Fickett (1998) J. Mol. Biol. 278:
167-181, based on an alignment generated with the ClustalW multiple sequence alignment
program, version 1.8 (Thompson et al. (1994) Nucleic Acids Research 22: 4673-4680),
with default parameters adjusted to a gap opening penalty of 20 and a gap extension
penalty of 0.2. The human (SEQ ID NO: 1) and mouse (SEQ ID NO: 5) genomic
sequences were aligned. Percentage identity was plotted for each contiguous 200 bp
segment of the human gene to identify segments differentially conserved (in comparison to
adjoining sequences) (Fig. 2).
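
The windowed identity calculation described above can be sketched as follows. This is a minimal illustration and not the procedure of Wasserman & Fickett: it assumes the two rows of a pairwise alignment are available as equal-length strings with "-" marking gaps, and that windows are counted over human bases only (how gaps were treated in the original analysis is not stated). The function names are hypothetical.

```python
def window_identity(human_aln, mouse_aln, window=200):
    """Fraction of identical positions in each window of consecutive human bases.

    Both arguments are rows of one pairwise alignment ('-' marks a gap).
    Returns one value per window start along the ungapped human sequence.
    """
    if len(human_aln) != len(mouse_aln):
        raise ValueError("aligned sequences must have equal length")
    # One 0/1 flag per human base: 1 if the aligned mouse base is identical.
    flags = [int(h == m)
             for h, m in zip(human_aln.upper(), mouse_aln.upper()) if h != "-"]
    if len(flags) < window:
        return []
    total = sum(flags[:window])
    fractions = [total / window]
    # Slide the window one base at a time, updating the running count.
    for i in range(window, len(flags)):
        total += flags[i] - flags[i - window]
        fractions.append(total / window)
    return fractions

def conserved_windows(fractions, cutoff=0.7):
    """1-based window start positions whose identity fraction exceeds the cutoff."""
    return [i + 1 for i, f in enumerate(fractions) if f > cutoff]

# Toy alignment with a 4-base window, for illustration only.
toy = window_identity("ACGTACGT", "ACGTTCGA", window=4)
```

With real data the returned fractions correspond to the solid line of Fig. 2, and the 0.7 cutoff corresponds to the 70% identity criterion used in this Example.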

In addition to segments of the published exon sequence, two differentially conserved
regions or "footprints" were identified in the human gene. Both of these regions are local
maxima and contain segments which exceed 70% nucleotide identity between the human
and mouse genomic sequences. One region, shown as positions 1250 to 2235, in particular
positions 1250 to 1749, in SEQ ID NO: 1, immediately adjacent to the published exon
region, is likely to contain the transcription start site and proximal promoter regulatory
sequences (Fig. 4). Another region, shown as positions 216 to 475 in SEQ ID NO: 1,
approximately 1700 bp distal from the transcription start site, is likely to function as some
form of regulatory region (either enhancer or repressor) (Fig. 3). (A schematic overview of
the extended FOXC2 gene is shown in Fig. 1.)

Further analysis of these regulatory regions identified short segments of higher
conservation between the mouse and human genes, suggesting that these specific segments
function as transcription factor binding sites. Screening of the TRANSFAC transcription
factor database (http://transfac.gbf.de) (see Wingender et al. (2000) Nucleic Acids
Research 28(1): 316-319) for matches to known transcription factors suggested the
presence of multiple forkhead-like binding sites in the distal regulatory enhancer, which
suggests potential auto-regulation of FOXC2 by its protein product.


A third region containing a local maximum for conservation and a segment exceeding 70%
identity is present at the 3' end of the published exon sequence within the 3' UTR. This
conserved region may have a role in mRNA processing.

The same analysis was performed with reference to 200 bp contiguous segments of the
mouse FoxC2 genomic sequence (SEQ ID NO: 5). The following conserved regions were
identified: 190 to 420; 1070 to 1645; and 5580 to 5875. They correlate to the regions
indicated above for the human sequence and should be considered orthologous regions.

EXAMPLE 3: Identification of an alternative human FOXC2 cDNA sequence 

BLASTN screening of the dbEST database from GenBank, using the human FOXC2
cDNA (SEQ ID NO: 9) as a query sequence, revealed several ESTs overlapping
portions of the available cDNA. A specialized tool, est_genome (http://www.sanger.ac.uk),
for the prediction of exon boundaries using ESTs was applied to compare the EST
sequences to the genomic sequences (see Mott, R. (1997) Computer Applications in the
Biosciences 13(4): 477-478). Two classes of ESTs were observed: sequences extending
into the 3'-untranslated region and sequences revealing an alternative first exon spliced to
a junction internal to the previously described first exon.

Specifically, it was found that the nucleotides in positions 33 to 182 in the EST with
accession no. AW271272 (SEQ ID NO: 11) were identical to positions 66 to 215 in the
extended FOXC2 genomic sequence (SEQ ID NO: 1), and that positions 183 to 327 in
SEQ ID NO: 11 were identical to positions 2516 to 2660 in SEQ ID NO: 1. Similarly,
positions 5 to 55 in the EST with accession no. AW793237 (SEQ ID NO: 12) were
identical to positions 165 to 215 in the extended FOXC2 genomic sequence (SEQ ID NO:
1), and positions 56 to 157 in SEQ ID NO: 12 were identical to positions 2516 to 2607 in
SEQ ID NO: 1. These results revealed an alternative splicing pattern in the human FOXC2
gene. According to this splicing pattern, an alternative gene sequence (SEQ ID NO: 3) is
derived by joining the regions shown as positions 1-215 and 2516-6458 in SEQ ID NO: 1.
Alternative splicing patterns are known to regulate the synthesis of a variety of peptides
and proteins. It may result in proteins with an entirely different function or in dysfunctional
or inhibitory splice products (for a review, see McKeown (1992) Annu. Rev. Cell. Biol. 8:
133-155).

The amino acids corresponding to positions 1 to 94 in the published FOXC2 transcription
factor (SEQ ID NO: 10) are missing in the protein encoded by the spliced variant generated
from the alternative promoter (SEQ ID NO: 4). Consequently, the entire region N-terminal
of the DNA binding domain and a portion of the DNA-binding domain (corresponding to
positions 72-94 in SEQ ID NO: 2) are not present in the splice variant. It is postulated that
this truncation leads to a protein which has a deficient "forkhead" DNA-binding region,
and thus has a potential inhibitory function on the biological activities of the FOXC2
protein. This truncated FOXC2 protein may have a role in regulation of FOXC2, and an
involvement in adipocyte differentiation and adipogenesis.
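
The derivation of the alternative sequence by joining two regions of the genomic sequence can be sketched as follows. The sketch assumes 1-based inclusive positions; the function names and the short toy sequence are hypothetical, and the actual sequences are those of the Sequence Listing.

```python
def extract(seq, start, end):
    """Return positions start..end (1-based, inclusive) of seq."""
    return seq[start - 1:end]

def join_segments(seq, segments):
    """Concatenate the listed 1-based inclusive segments of seq, in order."""
    return "".join(extract(seq, s, e) for s, e in segments)

def joined_length(segments):
    """Total number of bases covered by the listed segments."""
    return sum(e - s + 1 for s, e in segments)

# Regions of SEQ ID NO: 1 joined according to the splicing pattern of Example 3.
ALT_TRANSCRIPT_SEGMENTS = [(1, 215), (2516, 6458)]

# Toy demonstration on a hypothetical 12-base sequence.
toy_joined = join_segments("AAACCCGGGTTT", [(1, 3), (10, 12)])
```

Applied to the full genomic sequence, join_segments(seq, ALT_TRANSCRIPT_SEGMENTS) would return the joined regions as a single string.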


EXAMPLE 4: Cloning and sequencing of the FOXC2 promoter 

The DNA region corresponding to nucleotide 176 to nucleotide 2233 (SEQ ID NO: 1,
version 2) has been cloned using nested PCR on human genomic DNA. The PCR was
performed according to the Herculase™ protocol (Stratagene catalog #600260;
http://www.stratagene.com/pcr/herculase.htm) and with the inclusion of 8-10% DMSO.

In the initial reaction, the 5'-primer KRKX131 (CCATTGCCTTCTAGTCGCCTCC) was
used together with the 3'-primer KRKX133 (CGTTGGGGTCGGACACGGAGTA) using
250 ng Clontech Genomic DNA # 6550-1 as template. The nested reaction was performed
on 1/100 of the initial PCR reaction using the 5'-primer KRKX132
(GGTACCTACGCAGCCGATGAACAGCCA) and the 3'-primer KRKX134
(GCTAGCGCTGCTTCCGAGACGGCTCG). After the second PCR, the product was
analyzed by electrophoresis in a 1.2% agarose gel, and a PCR product of the expected size
was obtained and extracted for ligation into a TOPO PCR2.1 vector (Invitrogen, Carlsbad,
CA) by standard cloning procedures and thereafter sequenced. The PCR reaction and
cloning procedure were repeated in two parallel separate experiments, and sequence data
from the two separate reactions were compared with the bioinformatically assembled
sequence.
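
The placement of the nested primers on the genomic sequence can be checked computationally. The following is a minimal sketch and not part of the original procedure: the helper names (`reverse_complement`, `anneal_site`) are hypothetical, the two template strings are short excerpts transcribed from SEQ ID NO: 1, and the first six nucleotides of each nested primer are assumed to be a non-annealing 5' tail (they correspond to the KpnI and NheI recognition sequences GGTACC and GCTAGC). Offsets are reported relative to the excerpts only, not as SEQ ID NO: 1 coordinates.

```python
# Sketch: locate the annealing part of a primer on a template fragment.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def anneal_site(template: str, primer: str, tail: int = 0):
    """Find where a primer anneals relative to the given (top) strand.

    `tail` is the number of 5' nucleotides assumed not to anneal
    (for example an added restriction site). Returns (index, strand),
    where index is the 0-based offset in `template` and strand is "+"
    for a 5'-primer or "-" for a 3'-primer; returns None if no exact
    match is found.
    """
    template = template.upper()
    core = primer.upper()[tail:]
    pos = template.find(core)
    if pos != -1:
        return pos, "+"
    pos = template.find(reverse_complement(core))
    if pos != -1:
        return pos, "-"
    return None


# Excerpts of SEQ ID NO: 1 (assumed transcription of the listing).
UPSTREAM = "atcgaatacgcagccgatgaacagccagga"   # around the KRKX132 site
DOWNSTREAM = "ccggcgagccgtctcggaagcagcatg"    # just upstream of the ATG

KRKX132 = "GGTACCTACGCAGCCGATGAACAGCCA"
KRKX134 = "GCTAGCGCTGCTTCCGAGACGGCTCG"

print(anneal_site(UPSTREAM, KRKX132, tail=6))    # (6, '+')
print(anneal_site(DOWNSTREAM, KRKX134, tail=6))  # (4, '-')
```

In this sketch KRKX132 matches the top strand directly, whereas KRKX134 matches only as its reverse complement, ending immediately before the ATG start codon, as expected for a 3'-primer.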

A DNA region containing the promoter (Fig. 4) corresponding to nt 1179 to 2233 (SEQ ID
NO: 1, version 2) has been cloned using nested PCR in the same manner as described
above. In the initial reaction, the 5'-primer KRKX136
(GGTACCCCCCGAGCCTGGAAACTCCCT) was used together with the 3'-primer
KRKX134 (GCTAGCGCTGCTTCCGAGACGGCTCG) using 250 ng genomic DNA as a
template. The PCR reaction and cloning procedure were repeated in four parallel separate
experiments, and sequence data from the four separate reactions were compared with the
bioinformatically assembled sequence.

EXAMPLE 5: Tissue expression profiling of the alternative transcript 

Tissue expression profiling of the alternative transcript (SEQ ID NO: 3) is performed using
standard Northern blotting procedures. RNA samples from an array of human tissues,
including adipose tissue, are analyzed using an RNA or DNA probe specific for the
alternative transcript. The expression profile in adipose tissue could be indicative of a
putative regulatory function for the alternative gene product (SEQ ID NO: 4) in
adipogenesis and adipocyte differentiation.

In addition, reverse transcriptase PCR (RT-PCR) according to standard procedures is used
to detect very low level expression of the alternative transcript in adipose tissue. RNA is
prepared from human adipose tissue, and RT-PCR is performed using PCR primers
specific for the alternative transcript.

EXAMPLE 6: Mapping of the 5'-edge of the alternative exon by RACE-PCR 

RNA is prepared from human adipose tissue using standard protocols. RACE (Rapid
Amplification of cDNA Ends) PCR is performed using the SMART™ RACE cDNA
Amplification Kit (Clontech catalogue No. K1811-1;
http://www.clontech.com/product/catalog/PCR/smartrace.html). With this procedure, the
first strand synthesis produces cDNA with an extension containing a known sequence.
Due to the mechanism of the extension procedure, the extension is typically added only
to complete first strand cDNAs. The 5'-RACE PCR is then performed using the 5'-primer
provided with the kit, together with a 3'-primer corresponding to positions 210-237 in
SEQ ID NO: 3 (GAACTGGTAGATGCCGTTCAAGGTTTCC) specific for the alternative
transcript. The PCR product is cloned into a cloning vector and sequenced using standard
protocols.
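
To illustrate why this 3'-primer is specific for the alternative transcript, the sketch below checks that its reverse complement spans the junction between the alternative first exon and the second exon. It is a hedged illustration rather than part of the protocol: the function names are hypothetical, and the two exon excerpts are transcribed from SEQ ID NO: 1 (positions 201-215 and 2516 onwards) on the assumption that splicing joins position 215 to position 2516, as listed in Table II.

```python
# Sketch: how many primer nucleotides fall on each side of a splice junction.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def junction_overlap(exon1_tail: str, exon2_head: str, primer: str):
    """Return (nt in exon 1, nt in exon 2) covered by a 3'-primer.

    The primer is matched as its reverse complement against the spliced
    sequence exon1_tail + exon2_head; None is returned if it does not match.
    """
    spliced = (exon1_tail + exon2_head).upper()
    site = reverse_complement(primer)
    start = spliced.find(site)
    if start == -1:
        return None
    end = start + len(site)
    boundary = len(exon1_tail)  # first index belonging to exon 2
    in_exon1 = max(0, min(end, boundary) - start)
    in_exon2 = max(0, end - max(start, boundary))
    return in_exon1, in_exon2


# Excerpts of SEQ ID NO: 1 (assumed transcription of the listing).
EXON1_TAIL = "agggtgcaaggaaac"                  # positions 201-215
EXON2_HEAD = "cttgaacggcatctaccagttcatcatggac"  # from position 2516
RACE_PRIMER = "GAACTGGTAGATGCCGTTCAAGGTTTCC"

print(junction_overlap(EXON1_TAIL, EXON2_HEAD, RACE_PRIMER))  # (6, 22)
```

Under these assumptions, 6 of the 28 primer nucleotides lie in the alternative first exon and 22 in the second exon, which is consistent with positions 210-237 of SEQ ID NO: 3 straddling the splice site at 215-216.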

EXAMPLE 7: Functional analysis

The identified regulatory regions are analyzed to determine their impact on the
transcription of the FOXC2 gene or a reporter gene substituted for FOXC2. A PCR
reaction is performed to isolate the promoter region adjacent to the published exon
sequence, possibly including the sequences extending to the beginning of the ATG
encoding the first methionine. This PCR product is cloned into a reporter plasmid adjacent
to a reporter gene (e.g. luciferase). The upstream regulatory region, i.e. regions containing
both upstream and promoter proximal sequences, or these sequences bearing artificially
induced differences, are cloned in a similar manner. These constructs are transfected into a
cell culture model system and the level/activity of the protein encoded by the reporter gene
is determined. This provides information on the function of the identified regions
and can be used to assess the impact of the different regions on transcriptional regulation.
Similarly, the upstream regulatory region, a region containing both upstream and promoter
proximal sequences, or these sequences bearing artificially induced differences can be
cloned and used to assess the impact of these regions on the transcription of the reporter
gene.

EXAMPLE 8: Reporter gene assay to identify modulating compounds 

Reporter gene assays are well known as tools to signal transcriptional activity in cells. (For
a review of chemiluminescent and bioluminescent reporter gene assays, see Bronstein et al.
(1994) Analytical Biochemistry 219, 169-181.) For instance, the photoprotein luciferase
provides a useful tool for assaying for modulators of promoter activity. Cells are
transiently transfected with a reporter construct which includes a gene for the luciferase
protein downstream from the FOXC2 promoter and enhancer region, or fragments thereof
regulating the FOXC2 activity. Luciferase activity may be quantitatively measured using
e.g. luciferase assay reagents that are commercially available from Promega (Madison,
WI). Differences in luminescence in the presence versus the absence of a candidate
modulator compound are indicative of modulatory activity.
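
The comparison of luminescence in the presence versus the absence of a candidate compound can be expressed as a simple fold-change calculation. The sketch below is illustrative only: the function names, the use of replicate means, the optional background subtraction and the two-fold threshold are all assumptions that do not come from this document, and any readings passed to the functions would be the user's own measurements.

```python
from statistics import mean


def fold_change(treated, untreated, background=0.0):
    """Ratio of mean background-corrected luminescence, treated/untreated.

    `treated` and `untreated` are sequences of replicate readings in the
    same arbitrary units; `background` is an assumed blank reading.
    """
    t = mean(treated) - background
    u = mean(untreated) - background
    if u <= 0:
        raise ValueError("untreated signal must exceed background")
    return t / u


def is_modulator(treated, untreated, background=0.0, threshold=2.0):
    """Flag a compound whose signal changes at least `threshold`-fold.

    Both increases and decreases are flagged; the default threshold is a
    hypothetical cut-off chosen for illustration, not a validated value.
    """
    fc = fold_change(treated, untreated, background)
    return fc >= threshold or fc <= 1.0 / threshold
```

A compound would be examined further only if `is_modulator` returns True for its replicate readings; in practice, an appropriate threshold and statistical test would have to be chosen for the particular assay.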

TABLE I

Summary of FOXC2 sequences

SEQ ID NO:   GenBank accession no.   Description
1                                    Human FOXC2 extended genomic DNA sequence
2                                    Human FOXC2 polypeptide sequence (identical with SEQ ID NO: 10)
3                                    Human FOXC2 DNA sequence; alternative splicing
4                                    Human polypeptide sequence; alternative open reading frame
5            Y08222                  Mouse MHF-1 (FoxC2) genomic DNA sequence (CDS 2070 - 3554)
6            X74040                  Mouse MHF-1 (FoxC2) cDNA sequence
7                                    Mouse MHF-1 (FoxC2) polypeptide sequence
8            Y08223                  Human FKHL14 (FOXC2) genomic DNA sequence (CDS 1197 - 2702)
9            NM_005251               Human FKHL14 (FOXC2) cDNA sequence
10                                   Human FKHL14 (FOXC2) polypeptide sequence
11           AW271272                Human EST
12           AW793237                Human EST

TABLE II

Summary of features in human FOXC2 sequences shown as SEQ ID NOs: 1 and 3

Feature                                                         Positions

SEQ ID NO: 1
First exon according to the alternative transcript              1-215
 - Untranslated region                                          1-186
 - Region coding for 5'-part of alternative protein             187-215
Alternative first exon splice site                              215-216
Predicted enhancer region                                       216-475
 - E-box-like region                                            223-231
 - Forkhead-like region                                         359-375
 - Forkhead-like region                                         378-402
 - Ets-like region                                              403-423
Predicted promoter region                                       1250-1749
 - Forkhead-like region                                         1692-1703
First exon according to the published form of the transcript    1746-4629
 - Untranslated region                                          1746-2234
 - Polypeptide coding region                                    2235-3740
 - Region coding for DNA-binding domain                         2448-2735
Second exon according to the alternative transcript             2516-4629
 - Portion of polypeptide used in alternative transcript        2516-3740
 - Untranslated region                                          3741-4629

SEQ ID NO: 3
Polypeptide coding region (5' of splice site)                   187-215
Polypeptide coding region (3' of splice site)                   216-1437
 - Region coding for truncated portion of protein               216-435
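
The coordinates in Table II can be cross-checked with simple interval arithmetic. The sketch below is an illustration under stated assumptions and not part of the original disclosure: positions are treated as 1-based and inclusive, the helper names are hypothetical, and the only inputs are values quoted in Table II and in the feature lines of the Sequence Listing.

```python
def span_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive interval."""
    return end - start + 1


def codon_number(cds_start: int, position: int) -> int:
    """1-based index of the codon that contains `position`."""
    return (position - cds_start) // 3 + 1


CDS = (2235, 3740)        # polypeptide coding region in SEQ ID NO: 1
DBD = (2448, 2735)        # region coding for the DNA-binding domain
ALT_EXON2_START = 2516    # start of second exon in the alternative transcript

cds_nt = span_length(*CDS)
print(cds_nt, cds_nt // 3 - 1)                 # 1506 501 (codons minus stop)
print(codon_number(CDS[0], DBD[0]))            # 72
print(codon_number(CDS[0], ALT_EXON2_START))   # 94
```

These values agree with the description: the coding region corresponds to 501 amino acids plus a stop codon, the DNA-binding domain begins at residue 72, and the alternative second exon begins within codon 94, matching the statement that positions 72-94 of the DNA-binding domain are absent from the splice variant.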



1. A human FOXC2 promoter region comprising a sequence selected from:

(a) the nucleotide sequence set forth as positions 1692 to 1703 in SEQ ID
NO: 1, or a fragment thereof exhibiting FOXC2 promoter activity;

(b) the complementary strand of (a); and

(c) nucleotide sequences capable of hybridizing, under stringent hybridization
conditions, to a nucleotide sequence as defined in (a) or (b).

2. The human FOXC2 promoter region according to claim 1, comprising a
sequence selected from:

(a) the nucleotide sequence set forth as positions 1250 to 1749 in SEQ ID
NO: 1, or a fragment thereof exhibiting FOXC2 promoter activity;

(b) the complementary strand of (a); and

(c) nucleotide sequences capable of hybridizing, under stringent hybridization
conditions, to a nucleotide sequence as defined in (a) or (b).

3. The human FOXC2 promoter region according to claim 2, comprising a
sequence selected from:

(a) the nucleotide sequence set forth as positions 1250 to 2235 in SEQ ID
NO: 1, or a fragment thereof exhibiting FOXC2 promoter activity;

(b) the complementary strand of (a); and

(c) nucleotide sequences capable of hybridizing, under stringent hybridization
conditions, to a nucleotide sequence as defined in (a) or (b).

4. A recombinant construct comprising the human FOXC2 promoter region according
to any one of claims 1 to 3.

5. The recombinant construct according to claim 4 wherein the human FOXC2
promoter region is operably linked to a gene encoding a detectable product.

6. The recombinant construct according to claim 5 wherein said gene encoding a
detectable product is a human FOXC2 gene.

7. The recombinant construct according to claim 4 further comprising a reporter gene.

8. A vector comprising the recombinant construct according to any one of claims 4 to 7.

9. A host cell stably transformed with the recombinant construct according to any one
of claims 4 to 7.

10. A method for identification of an agent regulating FOXC2 promoter activity, said
method comprising the steps

(i) contacting a candidate agent with a human FOXC2 promoter region as defined in
any one of claims 1 to 3; and

(ii) determining whether said candidate agent modulates expression of the FOXC2
gene, such modulation being indicative for an agent capable of regulating FOXC2
promoter activity.

11. A method for identification of an agent regulating FOXC2 promoter activity, said
method comprising assaying reporter gene expression in a cell according to claim 9
in the presence and absence of a candidate agent, wherein an effect on the test level
of expression as compared to control level of expression is indicative of an agent
capable of regulating FOXC2 promoter activity.

12. A human FOXC2 enhancer region comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 223 to 231 in SEQ ID NO: 1, 
or a fragment thereof exhibiting FOXC2 enhancer activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 

13. A human FOXC2 enhancer region comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 359 to 375 in SEQ ID NO: 1, 
or a fragment thereof exhibiting FOXC2 enhancer activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization
conditions, to a nucleotide sequence as defined in (a) or (b). 

14. A human FOXC2 enhancer region comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 378 to 402 in SEQ ID NO: 1, 
or a fragment thereof exhibiting FOXC2 enhancer activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization 
conditions, to a nucleotide sequence as defined in (a) or (b). 



15. A human FOXC2 enhancer region comprising a sequence selected from: 

(a) the nucleotide sequence set forth as positions 403 to 423 in SEQ ID NO: 1, 
or a fragment thereof exhibiting FOXC2 enhancer activity; 

(b) the complementary strand of (a); and 

(c) nucleotide sequences capable of hybridizing, under stringent hybridization
conditions, to a nucleotide sequence as defined in (a) or (b).

16. The human FOXC2 enhancer region according to any one of claims 12 to 15
comprising a sequence selected from:

(a) the nucleotide sequence set forth as positions 216 to 475 in SEQ ID NO: 1,
or a fragment thereof exhibiting FOXC2 enhancer activity;

(b) the complementary strand of (a); and

(c) nucleotide sequences capable of hybridizing, under stringent hybridization
conditions, to a nucleotide sequence as defined in (a) or (b).



17. A recombinant construct comprising a human FOXC2 enhancer region according to 
any one of claims 12 to 15. 

18. A vector comprising the recombinant construct according to claim 17.

19. A host cell stably transformed with the recombinant construct according to claim 18.

20. A method for identification of an agent regulating FOXC2 enhancer activity, said
method comprising the steps

(i) contacting a candidate agent with the human FOXC2 enhancer region as defined
in any one of claims 12 to 16; and

(ii) determining whether said candidate agent modulates expression of the FOXC2
gene, such modulation being indicative for an agent capable of regulating FOXC2
enhancer activity.

21. A method for identification of an agent capable of regulating FOXC2 enhancer
activity, said method comprising assaying reporter gene expression in a cell as
defined in claim 19 in the presence and absence of a candidate agent, wherein an
effect on the test level of expression as compared to control level of expression is
indicative of an agent capable of regulating FOXC2 enhancer activity.

22. A method for identification of an agent capable of regulating a mammalian FOXC2 
promoter activity, said method comprising the steps 

(i) contacting a candidate agent with a murine FoxC2 promoter nucleotide sequence 
shown as positions 1250 to 2235 in SEQ ID NO: 5; and 

(ii) determining whether said candidate agent modulates expression of a mammalian 
FOXC2 gene, such modulation being indicative for an agent capable of regulating 
mammalian FOXC2 promoter activity. 

23. A method for identification of an agent capable of regulating a mammalian FOXC2
enhancer activity, said method comprising the steps

(i) contacting a candidate agent with a murine FoxC2 enhancer nucleotide sequence
shown as positions 216 to 475 in SEQ ID NO: 5; and

(ii) determining whether said candidate agent modulates expression of a mammalian 
FOXC2 gene, such modulation being indicative for an agent capable of regulating 
mammalian FOXC2 enhancer activity. 

24. A method for identification of an agent capable of regulating a mammalian FOXC2
enhancer activity, said method comprising the steps

(i) contacting a candidate agent with a murine FoxC2 enhancer nucleotide sequence
shown as positions 216 to 2235 in SEQ ID NO: 5; and

(ii) determining whether said candidate agent modulates expression of a mammalian
FOXC2 gene, such modulation being indicative for an agent capable of regulating
mammalian FOXC2 enhancer activity.



25. An isolated nucleic acid molecule selected from: 

(a) nucleic acid molecules comprising a nucleotide sequence as shown in SEQ ID 
NO: 3; 

(b) nucleic acid molecules comprising a nucleotide sequence capable of hybridizing,
under stringent hybridization conditions, to a nucleotide sequence complementary
to the polypeptide coding region of a nucleic acid molecule as defined in (a) and which
codes for a variant form of the FOXC2 transcription factor; and

(c) nucleic acid molecules comprising a nucleic acid sequence which is degenerate as
a result of the genetic code to a nucleotide sequence as defined in (a) or (b) and
which codes for a variant form of the FOXC2 transcription factor.

26. An isolated polypeptide encoded by the nucleic acid according to claim 25. 

27. The isolated polypeptide according to claim 26 having an amino acid sequence
shown as SEQ ID NO: 4 in the Sequence Listing.

28. A vector harboring the nucleic acid molecule according to claim 25.

29. A replicable expression vector which carries and is capable of mediating the
expression of a nucleotide sequence according to claim 25.

30. A cultured host cell harboring a vector according to claim 28 or 29.

31. A process for production of a variant form of the FOXC2 transcription factor
polypeptide, comprising culturing a host cell according to claim 30 under conditions
whereby said polypeptide is produced, and recovering said polypeptide.

32. A method for identifying an agent capable of regulating expression of the nucleic
acid molecule according to claim 25, said method comprising the steps

(i) contacting a candidate agent with the said nucleic acid molecule; and

(ii) determining whether said candidate agent modulates expression of the said
nucleic acid molecule.

33. An antisense oligonucleotide having a sequence capable of specifically hybridizing 
to RNA transcribed by the nucleic acid molecule according to claim 25, so as to 
prevent translation of the said RNA. 

ABSTRACT

The present invention relates to an isolated promoter region of the mammalian transcription
factor FOXC2. The invention also relates to screening methods for agents modulating the 
expression of FOXC2 and thereby being potentially useful for the treatment of medical 
conditions related to obesity. The invention further relates to a previously unknown variant 
of the human FOXC2 gene, derived via the use of an alternative promoter, which produces 
an additional exon that generates a distinct open reading frame via splicing. The alternative 
gene encodes a variant of the FOXC2 transcription factor, which is lacking a part of the 
DNA-binding domain and consequently has a potential regulatory function. 

SEQUENCE LISTING

<110> Pharmacia & Upjohn AB 

<120> Promoter Sequences 

<130> 00298 

<140> 
<141> 

<160> 12 

<170> PatentIn Ver. 2.1

<210> 1 

<211> 6458 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (2235)..(3740)

<400> 1

cctttggctt tgaattgatc aggagacaaa gataatgcat ctacattttc gtcttctgtt 60 
cttttattgg aaataagtgg cacgccccat tgccttctag tcgcctcccc gaagcgaaga 120 
ggccgaagcg aagaggcctg gtgggttgtc tcaacatcct tttgctgaga atcgaatacg 180 
cagccgatga acagccagga agggtgcaag gaaacctgaa atacaaatgt tctccctgaa 240 
gccctcttcc ctgcccaacc agaccagcaa cttccaaaat tctgcccgtg tttagccttg 300 
ttaaaggggt gtctcactcc ttcagggaaa gtgggaaaag gggatctgat tattgaggtg 360 
tggaaggaat aaataatcag tccacaaata aacaaactgt ccgggattcc tagagggaag 420 
gagaaatcct tgaaggagat ccaagtcgct ccaggtctgc ctgccgaata atatcatccc 480 
gaagggatct tgaaccgttt gcaatcaacc gctcacccag tcttcccacg gagcgcgctc 540 
cctaactcac cctacccacc caacaaaaca aaaaaaaggc tgaaatatag aaaagcaact 600 
tggaggctcc cagggggacg ttgccaggag caggaggcag ggacagcgcc ctagggtcgg 660 
tgttagcggc cggcgccggc ctgggccacg ggaaacgtcc acgcttggtg cccgcggtgc 720 
gcggcgctca ttgcgcgcgc cttcgagcca agcccccgcg gaaaacaggc tcgggtttct 780 
cctcgcaggg cccaggaact cggctctgcc tggcccgggt gggtcgctgc attgtcccgg 840 
tcttctggga gtgcggggtc agcttgttag agggaatttc tacctgggaa aagggagacg 900 
agtttcgaag ctgaagttgg taggctgcga gtgtccacgc gggagacgaa agggggaaat 960 
agcagagtca cttcaccctt ttccccaaac cccacaaaac tgctcgcagc gacgcggatg 1020 
atctaccgaa ttccccgcga attcggagga ttaagttgtc agtcagcacg ttgctacctt 1080
cccctctatg cactccgctg cctggctcct cggcggggag cgagggaaac tcagtttgta 1140
gggtttacct ctaaaacctc gataggttat ccttgacgac cccgagcctg gaaactccct 1200
gttgatgatt aattatttga ttaaataagt ataacatcca ggagaggccc tgccattcca 1260
atccagcgcg tttgcttttg aatccattac acctgggccc ccataattag gaaatctaat 1320
tattcgcttc atcactcatt aataagaaaa atgtcccagg atcattgcta cttacaaggt 1380
ctttgggaga gatattttac tctattaatc cattctattt tatatttcaa attgattttt 1440
tttaacagag gaaagtggct atctttttgt tttgggcatg tgggcccatt caccaaaatg 1500
tgatcataaa ataaatttta ataagatata actttttaaa aagttttcaa gtgaagacgg 1560
agtcgccgcg gaggccgggg cggcggggtc ttagagccga cggattcctg cgctcctcgc 1620
cccgattggc gccggactcc tctcagctgc cgggtgattg gctcaaagtt ccgggagggg 1680
gcgtggcccg aggaaagtaa aaactcgctt tcagcaagaa gacttttgaa acttttccca 1740
atccctaaaa gggacttggc ctctttttct gggctcagcg gggcagccgc tcggaccccg 1800
gcgcgctgac cctcggggct gccgattcgc tgggggcttg gagagcctcc tgcgcccctc 1860
ctcgcgcggg ccgagggtcc accttggtcc ccaggccgcg gcgtctccgc tgggtccgcg 1920
gccgcccgcc tgcccgcgct gccgccgccg ggtcctggag ccagcgagga gcggggccgg 1980
cgctgcgctt gcccggggcg cgccctccag gatgccgatc cgcccggtcc gctgaaagcg 2040
cgcgcccctg ctcggcccga gcgacgacga ccgcgcaccc tcgccccgga ggctgccagg 2100
agaccggggc cgcccctccc gctcccctcc tctccccctc tggctctctc gcgctctctc 2160
gctctcaggg cccccctcgc tcccccggcc gcagtccgtg cgcgagggcg ccggcgagcc 2220

gtctcggaag cagc atg cag gcg cgc tac tcc gtg tcc gac ccc aac gcc 2270
                Met Gln Ala Arg Tyr Ser Val Ser Asp Pro Asn Ala
                1 5 10

ctg gga gtg gtg ccc tac ctg agc gag cag aat tac tac cgg gct gcg 2318
Leu Gly Val Val Pro Tyr Leu Ser Glu Gln Asn Tyr Tyr Arg Ala Ala
15 20 25

ggc agc tac ggc ggc atg gcc agc ccc atg ggc gtc tat tcc ggc cac 2366
Gly Ser Tyr Gly Gly Met Ala Ser Pro Met Gly Val Tyr Ser Gly His
30 35 40

ccg gag cag tac agc gcg ggg atg ggc cgc tcc tac gcg ccc tac cac 2414
Pro Glu Gln Tyr Ser Ala Gly Met Gly Arg Ser Tyr Ala Pro Tyr His
45 50 55 60

cac cac cag ccc gcg gcg cct aag gac ctg gtg aag ccg ccc tac agc 2462
His His Gln Pro Ala Ala Pro Lys Asp Leu Val Lys Pro Pro Tyr Ser
65 70 75

tac atc gcg ctc atc acc atg gcc atc cag aac gcg ccc gag aag aag 2510
Tyr Ile Ala Leu Ile Thr Met Ala Ile Gln Asn Ala Pro Glu Lys Lys
80 85 90

atc acc ttg aac ggc atc tac cag ttc atc atg gac cgc ttc ccc ttc 2558
Ile Thr Leu Asn Gly Ile Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe
95 100 105

tac cgg gag aac aag cag ggc tgg cag aac agc atc cgc cac aac ctc 2606
Tyr Arg Glu Asn Lys Gln Gly Trp Gln Asn Ser Ile Arg His Asn Leu
110 115 120

tcg ctc aac gag tgc ttc gtc aag gtg ccc cgc gac gac aag aag ccc 2654
Ser Leu Asn Glu Cys Phe Val Lys Val Pro Arg Asp Asp Lys Lys Pro
125 130 135 140

ggc aag ggc agt tac tgg acc ctg gac ccg gac tcc tac aac atg ttc 2702
Gly Lys Gly Ser Tyr Trp Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe
145 150 155

gag aac ggc agc ttc ctg cgg cgc cgg cgg cgc ttc aaa aag aag gac 2750
Glu Asn Gly Ser Phe Leu Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp
160 165 170

gtg tcc aag gag aag gag gag cgg gcc cac ctc aag gag ccg ccc ccg 2798
Val Ser Lys Glu Lys Glu Glu Arg Ala His Leu Lys Glu Pro Pro Pro
175 180 185

gcg gcg tcc aag ggc gcc ccg gcc acc ccc cac cta gcg gac gcc ccc 2846
Ala Ala Ser Lys Gly Ala Pro Ala Thr Pro His Leu Ala Asp Ala Pro
190 195 200

aag gag gcc gag aag aag gtg gtg atc aag agc gag gcg gcg tcc ccg 2894
Lys Glu Ala Glu Lys Lys Val Val Ile Lys Ser Glu Ala Ala Ser Pro
205 210 215 220

gcg ctg ccg gtc atc acc aag gtg gag acg ctg agc ccc gag agc gcg 2942
Ala Leu Pro Val Ile Thr Lys Val Glu Thr Leu Ser Pro Glu Ser Ala
225 230 235

ctg cag ggc agc ccg cgc agc gcg gcc tcc acg ccc gcc ggc tcc ccc 2990
Leu Gln Gly Ser Pro Arg Ser Ala Ala Ser Thr Pro Ala Gly Ser Pro
240 245 250

gac ggt tcg ctg ccg gag cac cac gcc gcg gcg ccc aac ggg ctg cct 3038
Asp Gly Ser Leu Pro Glu His His Ala Ala Ala Pro Asn Gly Leu Pro
255 260 265

ggc ttc agc gtg gag aac atc atg acc ctg cga acg tcg ccg ccg ggc 3086
Gly Phe Ser Val Glu Asn Ile Met Thr Leu Arg Thr Ser Pro Pro Gly
270 275 280

gga gag ctg agc ccg ggg gcc gga cgc gcg ggc ctg gtg gtg ccg ccg 3134
Gly Glu Leu Ser Pro Gly Ala Gly Arg Ala Gly Leu Val Val Pro Pro
285 290 295 300

ctg gcg ctg cca tac gcc gcc gcg ccg ccc gcc gcc tac ggc cag ccg 3182
Leu Ala Leu Pro Tyr Ala Ala Ala Pro Pro Ala Ala Tyr Gly Gln Pro
305 310 315

tgc gct cag ggc ctg gag gcc ggg gcc gcc ggg ggc tac cag tgc agc 3230
Cys Ala Gln Gly Leu Glu Ala Gly Ala Ala Gly Gly Tyr Gln Cys Ser
320 325 330

atg cga gcg atg agc ctg tac acc ggg gcc gag cgg ccg gcg cac atg 3278
Met Arg Ala Met Ser Leu Tyr Thr Gly Ala Glu Arg Pro Ala His Met
335 340 345

tgc gtc ccg ccc gcc ctg gac gag gcc ctc tcg gac cac ccg agc ggc 3326
Cys Val Pro Pro Ala Leu Asp Glu Ala Leu Ser Asp His Pro Ser Gly
350 355 360

ccc acg tcg ccc ctg agc gct ctc aac ctc gcc gcc ggc cag gag ggc 3374
Pro Thr Ser Pro Leu Ser Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly
365 370 375 380

gcg ctc gcc gcc acg ggc cac cac cac cag cac cac ggc cac cac cac 3422
Ala Leu Ala Ala Thr Gly His His His Gln His His Gly His His His
385 390 395

ccg cag gcg ccg ccg ccc ccg ccg gct ccc cag ccc cag ccg acg ccg 3470
Pro Gln Ala Pro Pro Pro Pro Pro Ala Pro Gln Pro Gln Pro Thr Pro
400 405 410

cag ccc ggg gcc gcc gcg gcg cag gcg gcc tcc tgg tat ctc aac cac 3518
Gln Pro Gly Ala Ala Ala Ala Gln Ala Ala Ser Trp Tyr Leu Asn His
415 420 425

agc ggg gac ctg aac cac ctc ccc ggc cac acg ttc gcg gcc cag cag 3566
Ser Gly Asp Leu Asn His Leu Pro Gly His Thr Phe Ala Ala Gln Gln
430 435 440

caa act ttc ccc aac gtg cgg gag atg ttc aac tcc cac cgg ctg ggg 3614
Gln Thr Phe Pro Asn Val Arg Glu Met Phe Asn Ser His Arg Leu Gly
445 450 455 460

att gag aac tcg acc ctc ggg gag tcc cag gtg agt ggc aat gcc agc 3662
Ile Glu Asn Ser Thr Leu Gly Glu Ser Gln Val Ser Gly Asn Ala Ser
465 470 475

tgc cag ctg ccc tac aga tcc acg ccg cct ctc tat cgc cac gca gcc 3710
Cys Gln Leu Pro Tyr Arg Ser Thr Pro Pro Leu Tyr Arg His Ala Ala
480 485 490

ccc tac tcc tac gac tgc acg aaa tac tga cgtgtcccgg gacctcccct 3760
Pro Tyr Ser Tyr Asp Cys Thr Lys Tyr
495 500

ccccggcccg ctccggcttc gcttcccagc cccgacccaa ccagacaatt aaggggctgc 3820
agagacgcaa aaaagaaaca aaacatgtcc accaaccttt tctcagaccc gggagcagag 3880
agcgggcacg ctagccccca gccgtctgtg aagagcgcag gtaactttaa ttcgccgccc 3940
cgtttctggg atcccaggaa acccctccaa agggacgcag cccaacaaaa tgagtattgg 4000
tcttaaaatc cccctcccct accaggacgg ctgtgctgtg ctcgacctga gctttcaaaa 4060
gttaagttat ggacccaaat cccatagcga gcccctagtg actttctgta ggggtcccca 4120
taggtgtatg ggggtctcta tagataatat atgtgctgtg tgtaatttta aatttctcca 4180
accgtgctgt acaaatgtgt ggatttgtaa tcaggctatt ttgttgttgt tgttgttgtt 4240
cagagccatt aatataatat ttaaagttga gttcactgga taagtttttc atcttgccca 4300
accatttcta actgccaaat tgaattcaag aaaccgatgt gggttttgtt tcctgtacaa 4360
ttatgagata taattctttt tcccattgta ggtcttttac aaaacaagaa aataatttat 4420
ttttttgttg gtggataaag aagtcaagta tctgatactt tttatttaca aagtgtgatg 4480
gttttgtata gtaggttcca ccctgagtat tcctaaaaga aaaaaaaaaa aaaagcttaa 4540
aaactctaac ttcatctgtg tttgtcttac gtggtcttaa tcgttgtact taccttaaaa 4600
taaacccatg ttgttttttc tgcccaaagt ttggacagtg tgtttgtgtt gttgcatttt 4660
ttacaaacga ggtgtgtttg caaacccacc tgctttgatt atttttgtta cacaggtggg 4720
tatatgtgta gacacataaa aacgaccaga gaataggagc acacacctgc tgtcttgttt 4780
agtgacagaa aaaggctttt gattaatttt aaaatcccac tctaggattt tttcttttcg 4840
agaaaccgcc cagttggagg gggctgcctg aaggaccgga ccatgagttt gccgtgatgc 4900
attttcttaa atgcacaaaa acatgctaat tgtcaaaaca aacagtgcca ctccatctca 4960
gtgtccagcc gtccccagtt taggaggtga aggaagggaa gaataaacat ttcccgtttg 5020
ctaactgcaa cccagggtga gtcctgcttt cccccgattt tataaaattt gagcctcttt 5080
gcctgcttta atagttttcc agagaatttg aactgggcca atgaaggtct gaaggggacg 5140
gattttctag cgtttgatat ccatccccct tagcggccag atcagagggg aatttcagac 5200
tttattactt ctcaatgtca tgtctaaatc tacaccctca tcgcagtgaa aaattttaaa 5260
acctcattac ccttcaaaaa taatttatga tatttttaga gttctaaatt caagtttttc 5320
aatatgttaa ataatagaga ttattttttg ttttcaatgt taatatctcg tcttttacat 5380
ttttaatagt aacatagttt ttgtgaaatg tagctgacga aatggcttta ttatctattt 5440
caatggctga agtccaccac tcccctgctg gcctctatgt gtgaatttgg ggaccaaagc 5500
ttcatcaatt cccaccccag caggtgagct gtaccttgct aatgctgaag ttctttgtga 5560
gcttaacgtt tcaagaccag atgattttgc taaaggtgat tttgcttgat gcagtggcgc 5620
tgaacgtaac ccgggtgttt ttgtcgtgtt gttttcaaca tggcacttta tctccacgct 5680
atgttgaaat agaattaggg gaagcttaaa gcataataat tgtccccaca tgtgcaacac 5740
agactctttc aatctgtggc cccagaggtg gcacacagtt aagacttggc ggctgtctca 5800
ttctttttca taatgtgcgg gttcccgggt gtccgggtgc tagactttca gcaggcccca 5860
ggccagacgg gctttggttg agtgaacagg aggaggaagt taaggaggta ggggtgggga 5920
gagaccctct ccaagctgca gaagaaggtg gcccaagctc cttgcctgcg tctgccgtga 5980
tggtttcatt ttacttctgc tcgcttcatg ctatttgccc caggagaaga ggagagtatt 6040
ccagacggta agcgagctgg ctttttccct tccctagacg ttttaaagaa atctttctga 6100
aagcttgccc tcatcgtaag ctttgaaacc gttggtgtcc tgttagtggc gagggctgag 6160
agacacgcgg agaaataaag gagagcgacg gtgtggctga gagcccccag gtctgctgtt 6220
gaaactaagc tgggcttttg cacctttagg aagccttttt aaagaagtcc tgctgtgtgg 6280
gggccggaag cccaagtgag tgggccttgt ggaggttatc gggaggggtc tttaccactc 6340
cttggggaac gtgggcaacg gggggattgt atctgaagct ttattcaggt cttcggcggc 6400
agcagagtgg agaaccaggc ccttagtgtg tagcggcctg gggattttgg gactcatc 6458



<210> 2 
<211> 501 
<212> PRT 

<213> Homo sapiens 



<400> 2 





Met Gln Ala Arg Tyr Ser Val Ser Asp Pro Asn Ala Leu Gly Val Val
1 5 10 15

Pro Tyr Leu Ser Glu Gln Asn Tyr Tyr Arg Ala Ala Gly Ser Tyr Gly
20 25 30

Gly Met Ala Ser Pro Met Gly Val Tyr Ser Gly His Pro Glu Gln Tyr
35 40 45

Ser Ala Gly Met Gly Arg Ser Tyr Ala Pro Tyr His His His Gln Pro
50 55 60

Ala Ala Pro Lys Asp Leu Val Lys Pro Pro Tyr Ser Tyr Ile Ala Leu
65 70 75 80

Ile Thr Met Ala Ile Gln Asn Ala Pro Glu Lys Lys Ile Thr Leu Asn
85 90 95

Gly Ile Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn
100 105 110

Lys Gln Gly Trp Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu
115 120 125

Cys Phe Val Lys Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser
130 135 140

Tyr Trp Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser
145 150 155 160

Phe Leu Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp Val Ser Lys Glu
165 170 175

Lys Glu Glu Arg Ala His Leu Lys Glu Pro Pro Pro Ala Ala Ser Lys
180 185 190

Gly Ala Pro Ala Thr Pro His Leu Ala Asp Ala Pro Lys Glu Ala Glu
195 200 205

Lys Lys Val Val Ile Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val
210 215 220
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He 


Thr 


Lys 


Val 


Glu 


Thr 


Leu 


Ser 


Pro 


Glu 


Ser 


Ala 


Leu 


Gin Gly 


Ser 


225 










230 










235 










240 


Pro 


Arg 


Ser 


Ala 


Ala 


Ser 


Thr 


Pro 


Ala 


Gly 


Ser 


Pro 


Asp 


Gly 


Ser 


Leu 










245 










250 










255 




Pro 


Glu 


His 


His 


Ala 


Ala 


Ala 


Pro 


Asn 


Gly 


Leu 


Pro 


Gly 


Phe 


Ser 


Val 








260 










265 










270 






Glu 


Asn 


He 


Met 


Thr 


Leu 


Arg 


Thr 


Ser 


Pro 


Pro 


Gly Gly Glu 


Leu 


Ser 






275 










280 










285 








Pro 


Gly 


Ala 


Gly 


Arg 


Ala 


Gly 


Leu 


Val 


Val 


Pro 


Pro 


Leu 


Ala 


Leu 


Pro 




290 










295 










300 










Tyr 


Ala 


Ala 


Ala 


Pro 


Pro 


Ala 


Ala 


Tyr 


Gly 


Gin 


Pro 


Cys 


Ala 


Gin 


Gly 


305 










310 










315 










320 


Leu 


Glu 


Ala 


Gly 


Ala 


Ala 


Gly 


Gly 


Tyr 


Gin 


Cys 


Ser 


Met 


Arg 


Ala 


Met 










325 










330 










335 




Ser 


Leu 


Tyr 


Thr 


Gly 


Ala 


Glu 


Arg 


Pro 


Ala 


His 


Met 


Cys 


Val 


Pro 


Pro 








340 










345 










350 






Ala 


Leu 


Asp 


Glu 


Ala 


Leu 


Ser 


Asp 


His 


Pro 


Ser 


Gly 


Pro 


Thr 


Ser 


Pro 






355 










360 










365 








Leu 


Ser 


Ala 


Leu 


Asn 


Leu 


Ala 


Ala 


Gly 


Gin 


Glu Gly Ala 


Leu 


Ala 


Ala 




370 










375 










380 










Thr 


Gly 


His 


His 


His 


Gin 


His 


His 


Gly 


His 


His 


His 


Pro 


Gin 


Ala 


Pro 


385 










390 










395 










400 


Pro 


Pro 


Pro 


Pro 


Ala 


Pro 


Gin 


Pro 


Gin 


Pro 


Thr 


Pro 


Gin 


Pro 


Gly 


Ala 










405 










410 










415 




Ala 


Ala 


Ala 


Gin 


Ala 


Ala 


Ser 


Trp 


Tyr 


Leu 


Asn 


His 


Ser 


Gly Asp 


Leu 








420 










425 










430 






Asn 


His 


Leu 


Pro 


Gly 


His 


Thr 


Phe 


Ala 


Ala 


Gin 


Gin 


Gin 


Thr 


Phe 


Pro 






435 










440 










445 








Asn 


Val 


Arg 


Glu 


Met 


Phe 


Asn 


Ser 


His 


Arg 


Leu Gly 


lie 


Glu 


Asn 


Ser 




450 










455 










460 










Thr 


Leu 


Gly 


Glu 


Ser 


Gin 


Val 


Ser 


Gly 


Asn 


Ala 


Ser 


Cys 


Gin 


Leu 


Pro 


4 65 










470 










475 










480 


Tyr 


Arg 


Ser 


Thr 


Pro 


Pro 


Leu 


Tyr 


Arg 


His 


Ala 


Ala 


Pro 


Tyr 


Ser 


Tyr 










485 










4 90 










495 




Asp 


Cys 


Thr 


Lys 


Tyr 

























500 



<210> 3 

<211> 4158 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (187)..(1437)
<400> 3 

cctttggctt tgaattgatc aggagacaaa gataatgcat ctacattttc gtcttctgtt 60 
cttttattgg aaataagtgg cacgccccat tgccttctag tcgcctcccc gaagcgaaga 120 
ggccgaagcg aagaggcctg gtgggttgtc tcaacatcct tttgctgaga atcgaatacg 180 



cagccg atg aac agc cag gaa ggg tgc aag gaa acc ttg aac ggc atc 228
       Met Asn Ser Gln Glu Gly Cys Lys Glu Thr Leu Asn Gly Ile
       1               5                   10

tac cag ttc atc atg gac cgc ttc ccc ttc tac cgg gag aac aag cag 276
Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn Lys Gln
15                  20                  25                  30

ggc tgg cag aac agc atc cgc cac aac ctc tcg ctc aac gag tgc ttc 324
Gly Trp Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu Cys Phe
                35                  40                  45

gtc aag gtg ccc cgc gac gac aag aag ccc ggc aag ggc agt tac tgg 372
Val Lys Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser Tyr Trp
            50                  55                  60

acc ctg gac ccg gac tcc tac aac atg ttc gag aac ggc agc ttc ctg 420
Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser Phe Leu
        65                  70                  75

cgg cgc cgg cgg cgc ttc aaa aag aag gac gtg tcc aag gag aag gag 468
Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp Val Ser Lys Glu Lys Glu
    80                  85                  90

gag cgg gcc cac ctc aag gag ccg ccc ccg gcg gcg tcc aag ggc gcc 516
Glu Arg Ala His Leu Lys Glu Pro Pro Pro Ala Ala Ser Lys Gly Ala
95                  100                 105                 110

ccg gcc acc ccc cac cta gcg gac gcc ccc aag gag gcc gag aag aag 564
Pro Ala Thr Pro His Leu Ala Asp Ala Pro Lys Glu Ala Glu Lys Lys
                115                 120                 125

gtg gtg atc aag agc gag gcg gcg tcc ccg gcg ctg ccg gtc atc acc 612
Val Val Ile Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val Ile Thr
            130                 135                 140

aag gtg gag acg ctg agc ccc gag agc gcg ctg cag ggc agc ccg cgc 660
Lys Val Glu Thr Leu Ser Pro Glu Ser Ala Leu Gln Gly Ser Pro Arg
        145                 150                 155

agc gcg gcc tcc acg ccc gcc ggc tcc ccc gac ggt tcg ctg ccg gag 708
Ser Ala Ala Ser Thr Pro Ala Gly Ser Pro Asp Gly Ser Leu Pro Glu
    160                 165                 170

cac cac gcc gcg gcg ccc aac ggg ctg cct ggc ttc agc gtg gag aac 756
His His Ala Ala Ala Pro Asn Gly Leu Pro Gly Phe Ser Val Glu Asn
175                 180                 185                 190

atc atg acc ctg cga acg tcg ccg ccg ggc gga gag ctg agc ccg ggg 804
Ile Met Thr Leu Arg Thr Ser Pro Pro Gly Gly Glu Leu Ser Pro Gly
                195                 200                 205

gcc gga cgc gcg ggc ctg gtg gtg ccg ccg ctg gcg ctg cca tac gcc 852
Ala Gly Arg Ala Gly Leu Val Val Pro Pro Leu Ala Leu Pro Tyr Ala
            210                 215                 220

gcc gcg ccg ccc gcc gcc tac ggc cag ccg tgc gct cag ggc ctg gag 900
Ala Ala Pro Pro Ala Ala Tyr Gly Gln Pro Cys Ala Gln Gly Leu Glu
        225                 230                 235

gcc ggg gcc gcc ggg ggc tac cag tgc agc atg cga gcg atg agc ctg 948
Ala Gly Ala Ala Gly Gly Tyr Gln Cys Ser Met Arg Ala Met Ser Leu
    240                 245                 250

tac acc ggg gcc gag cgg ccg gcg cac atg tgc gtc ccg ccc gcc ctg 996
Tyr Thr Gly Ala Glu Arg Pro Ala His Met Cys Val Pro Pro Ala Leu
255                 260                 265                 270

gac gag gcc ctc tcg gac cac ccg agc ggc ccc acg tcg ccc ctg agc 1044
Asp Glu Ala Leu Ser Asp His Pro Ser Gly Pro Thr Ser Pro Leu Ser
                275                 280                 285

gct ctc aac ctc gcc gcc ggc cag gag ggc gcg ctc gcc gcc acg ggc 1092
Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Ala Ala Thr Gly
            290                 295                 300

cac cac cac cag cac cac ggc cac cac cac ccg cag gcg ccg ccg ccc 1140
His His His Gln His His Gly His His His Pro Gln Ala Pro Pro Pro
        305                 310                 315

ccg ccg gct ccc cag ccc cag ccg acg ccg cag ccc ggg gcc gcc gcg 1188
Pro Pro Ala Pro Gln Pro Gln Pro Thr Pro Gln Pro Gly Ala Ala Ala
    320                 325                 330

gcg cag gcg gcc tcc tgg tat ctc aac cac agc ggg gac ctg aac cac 1236
Ala Gln Ala Ala Ser Trp Tyr Leu Asn His Ser Gly Asp Leu Asn His
335                 340                 345                 350

ctc ccc ggc cac acg ttc gcg gcc cag cag caa act ttc ccc aac gtg 1284
Leu Pro Gly His Thr Phe Ala Ala Gln Gln Gln Thr Phe Pro Asn Val
                355                 360                 365

cgg gag atg ttc aac tcc cac cgg ctg ggg att gag aac tcg acc ctc 1332
Arg Glu Met Phe Asn Ser His Arg Leu Gly Ile Glu Asn Ser Thr Leu
            370                 375                 380

ggg gag tcc cag gtg agt ggc aat gcc agc tgc cag ctg ccc tac aga 1380
Gly Glu Ser Gln Val Ser Gly Asn Ala Ser Cys Gln Leu Pro Tyr Arg
        385                 390                 395

tcc acg ccg cct ctc tat cgc cac gca gcc ccc tac tcc tac gac tgc 1428
Ser Thr Pro Pro Leu Tyr Arg His Ala Ala Pro Tyr Ser Tyr Asp Cys
    400                 405                 410

acg aaa tac tgacgtgtcc cgggacctcc cctccccggc ccgctccggc 1477
Thr Lys Tyr
415





ttcgcttccc agccccgacc caaccagaca attaaggggc tgcagagacg caaaaaagaa 1537
acaaaacatg tccaccaacc ttttctcaga cccgggagca gagagcgggc acgctagccc 1597
ccagccgtct gtgaagagcg caggtaactt taattcgccg ccccgtttct gggatcccag 1657
gaaacccctc caaagggacg cagcccaaca aaatgagtat tggtcttaaa atccccctcc 1717
cctaccagga cggctgtgct gtgctcgacc tgagctttca aaagttaagt tatggaccca 1777
aatcccatag cgagccccta gtgactttct gtaggggtcc ccataggtgt atgggggtct 1837
ctatagataa tatatgtgct gtgtgtaatt ttaaatttct ccaaccgtgc tgtacaaatg 1897
tgtggatttg taatcaggct attttgttgt tgttgttgtt gttcagagcc attaatataa 1957
tatttaaagt tgagttcact ggataagttt ttcatcttgc ccaaccattt ctaactgcca 2017
aattgaattc aagaaaccga tgtgggtttt gtttcctgta caattatgag atataattct 2077
ttttcccatt gtaggtcttt tacaaaacaa gaaaataatt tatttttttg ttggtggata 2137
aagaagtcaa gtatctgata ctttttattt acaaagtgtg atggttttgt atagtaggtt 2197
ccaccctgag tattcctaaa agaaaaaaaa aaaaaaagct taaaaactct aacttcatct 2257
gtgtttgtct tacgtggtct taatcgttgt acttacctta aaataaaccc atgttgtttt 2317
ttctgcccaa agtttggaca gtgtgtttgt gttgttgcat tttttacaaa cgaggtgtgt 2377
ttgcaaaccc acctgctttg attatttttg ttacacaggt gggtatatgt gtagacacat 2437
aaaaacgacc agagaatagg agcacacacc tgctgtcttg tttagtgaca gaaaaaggct 2497
tttgattaat tttaaaatcc cactctagga ttttttcttt tcgagaaacc gcccagttgg 2557
agggggctgc ctgaaggacc ggaccatgag tttgccgtga tgcattttct taaatgcaca 2617
aaaacatgct aattgtcaaa acaaacagtg ccactccatc tcagtgtcca gccgtcccca 2677
gtttaggagg tgaaggaagg gaagaataaa catttcccgt ttgctaactg caacccaggg 2737
tgagtcctgc tttcccccga ttttataaaa tttgagcctc tttgcctgct ttaatagttt 2797
tccagagaat ttgaactggg ccaatgaagg tctgaagggg acggattttc tagcgtttga 2857
tatccatccc ccttagcggc cagatcagag gggaatttca gactttatta cttctcaatg 2917
tcatgtctaa atctacaccc tcatcgcagt gaaaaatttt aaaacctcat tacccttcaa 2977
aaataattta tgatattttt agagttctaa attcaagttt ttcaatatgt taaataatag 3037
agattatttt ttgttttcaa tgttaatatc tcgtctttta catttttaat agtaacatag 3097
tttttgtgaa atgtagctga cgaaatggct ttattatcta tttcaatggc tgaagtccac 3157
cactcccctg ctggcctcta tgtgtgaatt tggggaccaa agcttcatca attcccaccc 3217
cagcaggtga gctgtacctt gctaatgctg aagttctttg tgagcttaac gtttcaagac 3277
cagatgattt tgctaaaggt gattttgctt gatgcagtgg cgctgaacgt aacccgggtg 3337
tttttgtcgt gttgttttca acatggcact ttatctccac gctatgttga aatagaatta 3397
ggggaagctt aaagcataat aattgtcccc acatgtgcaa cacagactct ttcaatctgt 3457
ggccccagag gtggcacaca gttaagactt ggcggctgtc tcattctttt tcataatgtg 3517
cgggttcccg ggtgtccggg tgctagactt tcagcaggcc ccaggccaga cgggctttgg 3577
ttgagtgaac aggaggagga agttaaggag gtaggggtgg ggagagaccc tctccaagct 3637
gcagaagaag gtggcccaag ctccttgcct gcgtctgccg tgatggtttc attttacttc 3697
tgctcgcttc atgctatttg ccccaggaga agaggagagt attccagacg gtaagcgagc 3757
tggctttttc ccttccctag acgttttaaa gaaatctttc tgaaagcttg ccctcatcgt 3817
aagctttgaa accgttggtg tcctgttagt ggcgagggct gagagacacg cggagaaata 3877
aaggagagcg acggtgtggc tgagagcccc caggtctgct gttgaaacta agctgggctt 3937
ttgcaccttt aggaagcctt tttaaagaag tcctgctgtg tgggggccgg aagcccaagt 3997
gagtgggcct tgtggaggtt atcgggaggg gtctttacca ctccttgggg aacgtgggca 4057
acggggggat tgtatctgaa gctttattca ggtcttcggc ggcagcagag tggagaacca 4117
ggcccttagt gtgtagcggc ctggggattt tgggactcat c 4158



<210> 4 

<211> 417 

<212> PRT 

<213> Homo sapiens 

<400> 4 

Met Asn Ser Gln Glu Gly Cys Lys Glu Thr Leu Asn Gly Ile Tyr Gln
1               5                   10                  15

Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn Lys Gln Gly Trp
            20                  25                  30

Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu Cys Phe Val Lys
        35                  40                  45

Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser Tyr Trp Thr Leu
    50                  55                  60

Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser Phe Leu Arg Arg
65                  70                  75                  80

Arg Arg Arg Phe Lys Lys Lys Asp Val Ser Lys Glu Lys Glu Glu Arg
                85                  90                  95

Ala His Leu Lys Glu Pro Pro Pro Ala Ala Ser Lys Gly Ala Pro Ala
            100                 105                 110

Thr Pro His Leu Ala Asp Ala Pro Lys Glu Ala Glu Lys Lys Val Val
        115                 120                 125

Ile Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val Ile Thr Lys Val
    130                 135                 140

Glu Thr Leu Ser Pro Glu Ser Ala Leu Gln Gly Ser Pro Arg Ser Ala
145                 150                 155                 160

Ala Ser Thr Pro Ala Gly Ser Pro Asp Gly Ser Leu Pro Glu His His
                165                 170                 175

Ala Ala Ala Pro Asn Gly Leu Pro Gly Phe Ser Val Glu Asn Ile Met
            180                 185                 190

Thr Leu Arg Thr Ser Pro Pro Gly Gly Glu Leu Ser Pro Gly Ala Gly
        195                 200                 205

Arg Ala Gly Leu Val Val Pro Pro Leu Ala Leu Pro Tyr Ala Ala Ala
    210                 215                 220

Pro Pro Ala Ala Tyr Gly Gln Pro Cys Ala Gln Gly Leu Glu Ala Gly
225                 230                 235                 240

Ala Ala Gly Gly Tyr Gln Cys Ser Met Arg Ala Met Ser Leu Tyr Thr
                245                 250                 255

Gly Ala Glu Arg Pro Ala His Met Cys Val Pro Pro Ala Leu Asp Glu
            260                 265                 270

Ala Leu Ser Asp His Pro Ser Gly Pro Thr Ser Pro Leu Ser Ala Leu
        275                 280                 285

Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Ala Ala Thr Gly His His
    290                 295                 300

His Gln His His Gly His His His Pro Gln Ala Pro Pro Pro Pro Pro
305                 310                 315                 320

Ala Pro Gln Pro Gln Pro Thr Pro Gln Pro Gly Ala Ala Ala Ala Gln
                325                 330                 335

Ala Ala Ser Trp Tyr Leu Asn His Ser Gly Asp Leu Asn His Leu Pro
            340                 345                 350

Gly His Thr Phe Ala Ala Gln Gln Gln Thr Phe Pro Asn Val Arg Glu
        355                 360                 365

Met Phe Asn Ser His Arg Leu Gly Ile Glu Asn Ser Thr Leu Gly Glu
    370                 375                 380

Ser Gln Val Ser Gly Asn Ala Ser Cys Gln Leu Pro Tyr Arg Ser Thr
385                 390                 395                 400

Pro Pro Leu Tyr Arg His Ala Ala Pro Tyr Ser Tyr Asp Cys Thr Lys
                405                 410                 415

Tyr



<210> 5 

<211> 6021 

<212> DNA 

<213> Mus musculus 

<220> 

<221> exon 

<222> (1649)..(4348)

<300> 

<308> GenBank/Y08222 
<309> 1997-05-14
<300> 

<301> Miura, N 
<303> Genomics 
<304> 41 
<306> 489-492 
<307> 1997 



<400> 5 

ctcgagtcaa aggtagcaca cataaaacct attttgctgc ttcggtacgt caagcaatgc 60
cactaaagtt tcctcacccg ccaaagctga aacagtgagt tctaatctct caaagccttt 120
tgccgaaaat ctaaaggggg tggggggcta tggtggtggc gtgggggggg ggtcggagaa 180
gaagaaagac tgagacaaat gttttatctg tcgccttctt ccctacccaa ccggaccaac 240
aacttccaga aggttctgcg aggcatagag ccattccgta gggacatctc ggtgcttctg 300
aggaagcgga ccgagcaggg atccgatgac gactggagat gttgaaggaa taaataccag 360
tccacaaata aacaaactgt ccccgggatt cctagaggga aggagcacgc ttgaaggtcg 420
gggaactccg agtcgctgtg cgtcaaggtt ggcataaaat taaaaaaaaa aaaagtcctt 480
cagttaccag gccctctaag gagcccctgg tcctcagctc accttatcaa aactcagtaa 540
aacaaacagc ctgaaataca gtcaatttac aggatcccaa agatgctgac cgcggagtgg 600
gacccacgcc gggccccggc aacagctagg gaagcgggtc cgaggctaca cagtgccgcg 660
ctcctttgcg tttccagtga cgaagccggc gatggagtgc aggcttggag ctccccacgc 720
cgaacgggga caccagctcc cgggggctgg ctgccttgtc ctaacctcca gacagcgctt 780
tcataggtgg ggagaaggga gaggccggga tggatggcag ggaaagctag ccctcgtcta 840
tgcgggagag gagaccagga aagcaacagt tgggttcacg cgcttccctg aaccccacga 900
aattgtttgg aggactcaga tggatcacct aagtagcagc gaagacgaag gaccaatggt 960
tccttaggtg ttaccttccc agtttggcat tcccactaag ccttccctcc cagcccgacc 1020
ccgtcgtgaa ggggagagga accgaattct ccaacccggc ctcctttgtg ggctcttcct 1080
caacctggaa gcgtcctgtg aattatccat cactgcattc aacaggccct acacgctcag 1140
tccgtttgct ctgaacccat tacaactagg ccccgataat taagaaatct aattattcgc 1200
ctcttcatcc attaataata ataaaaaaaa aatctccagg ctctttccta cttacaaggt 1260
cttgggggca aatctctgcc caacttcatc aattcgatgt tatatttcaa actaaacttc 1320
tttttatttt ccaaaggaac agggttttta atttttgctc tggacacgtg gtctcgttaa 1380
acaaaatgtg ataataaaat aaaattttat aagatgtaac tcatttttaa aagtcctcaa 1440
gttaacttga gctggggggg ggggagatct ggctaagagc atctgggtct tagagccgac 1500
ggattcaggc gctcctcgtt ttgattggtg ccatccttct cgcagctgcc agatgattgg 1560
tgcaaacttc ctggaggggg cgcggcctga agaaagtaaa aactcgcttt gagccagaag 1620
acttttgaaa cttttcccaa tccctaaaag ggactttgct tctttttccg ggctcggccg 1680
cgcagcctct ccggacccta gctcgctgac gctgcgggct gcagttctcc tggcggggcc 1740
cgagagccgc tgtctccttt tctagcactc ggaagggctg gtgtcgctcc acggtcgcgc 1800
gtggcgtctg tgccgccagc tcagggctgc cacccgccaa gccgagagtg cgcggccagc 1860
ggggccgcct gccgtgcacc cttcaggatg ccgatccgcc cggtcggctg aacccgagcg 1920
ccggcgtctt ccgcgcgtgg accgcgaggc tgccccgagt cggggctgcc tgcatcgctc 1980
cgtcccttcc tgctctcctg ctccgggcct cgctcgccgc gggccgcagt cggtgcgcgc 2040
aggcggcgac cgggcgtctg ggacgcagca tgcaggcgcg ttactcggta tcggacccca 2100
acgccctggg agtggtaccc tatttgagtg agcaaaacta ctaccgggcg gccggcagct 2160
acggcggcat ggccagcccc atgggcgtct actccggcca cccggagcag tacggcgccg 2220
gcatgggccg ctcctacgcg ccctaccacc accagcccgc ggcgcccaag gacctggtga 2280
agccgcccta cagctatata gcgctcatca ccatggcgat ccagaacgcg ccagagaaga 2340
agatcactct gaacggcatc taccagttca tcatggaccg tttccccttc taccgcgaga 2400
acaagcaggg ctggcagaac agcatccgcc acaacctgtc actcaatgag tgcttcgtga 2460
aagtgccgcg cgacgacaag aagccgggca agggcagcta ctggacgctc gacccggact 2520
cctacaacat gttcgagaat ggcagcttcc tgcggcggcg gcggcgcttc aagaagaagg 2580
atgtgcccaa ggacaaggag gagcgggccc acctcaagga gccgccctcg accacggcca 2640
agggcgctcc gacagggacc ccggtagctg acgggcccaa ggaggccgag aagaaagtcg 2700
tggttaagag cgaggcggcg tcccccgcgc tgccggtcat caccaaggtg gagacgctga 2760
gccccgaggg agcgctgcag gccagtccgc gcagcgcatc ctccacgccc gcaggttccc 2820
cagacggctc gctgccggag caccacgccg cggcgcctaa cgggctgccc ggcttcagcg 2880
tggagaccat catgacgctg cgcacgtcgc ctccgggcgg cgatctgagc ccagcggccg 2940
cgcgcgccgg cctggtggtg ccaccgctgg cactgccata cgccgcagcg ccacccgccg 3000
cttacacgca gccgtgcgcg cagggcctgg aggctgcggg ctccgcgggc taccagtgca 3060
gtatgcgggc tatgagtctg tacaccgggg ccgagcggcc cgcgcacgtg tgcgttccgc 3120
ccgcgctgga cgaggctctg tcggaccacc cgagcggccc cggctccccg ctcggcgccc 3180
tcaacctcgc agcgggtcag gagggcgcgt tgggggcctc gggtcaccac caccagcatc 3240
acggccacct ccacccgcag gcgccaccgc ccgccccgca gccccctccc gcgccgcagc 3300
ccgccaccca ggccacctcc tggtatctga accacggcgg ggacctgagc cacctccccg 3360
gccacacgtt tgcaacccaa cagcaaactt tccccaacgt ccgggagatg ttcaactcgc 3420
accggctagg actggacaac tcgtccctcg gggagtccca ggtgagcaat gcgagctgtc 3480
agctgcccta tcgagctacg ccgtccctct accgccacgc agccccctac tcttacgact 3540
gcaccaaata ctgaggctgt ccagtccgct ccagccccag gaccgcaccg gcttcgcctc 3600
ctccatggga accttcttcg acggagccgc agaaagcgac ggaaagcgcc cctctctcag 3660
aaccaggagc agagagctcc gtgcaactcg caggtaactt atccgcagct cagtttgaga 3720
tctcagcgag tccctctaag ggggatgcag cccagcaaaa cgaaatacag attttttttt 3780
taattccttc ccctacccag atgctgcgcc tgctcccttg gggcttcata gattagctta 3840
tggaccaaac ccatagggac ccctaatgac ttctgtggag attctccacg ggcgcaagag 3900
gtctctccgg ataaggtgcc ttctgtaaac gagtgcggat ttgtaaccag gctattttgt 3960
tcttgcccag agcctttaat ataatattta aagttgtgtc cactggataa ggtttcgtct 4020
tgcccaactg ttactgccaa attgaattca agaaacgtgt gtgggtcttt tctccccacg 4080
tcaccatgat aaaataggtc cctccccaaa ctgtaggtct tttacaaaac aagaaaataa 4140
tttatttttt tgttgttgtt ggataacgaa attaagtatc ggatactttt aatttaggaa 4200
gtgcatggct ttgtacagta gatgccatct ggggtattcc aaaaacacac caaaagactt 4260
taaaatttca atctcacctg tgtttgtctt atgtgatctc agtgttgtat ttaccttaaa 4320
ataaacccgt gttgtttttc tgcccaaagt tcggacagag tctttgtgtt cttgaatttt 4380
aaaagggaaa ttgtagtaag ccagttgtga ttgatttttg tgatgcaggt tggcctggta 4440
acgtggatgc atatacaggt tacaggacga tggagctctc gattagtaat agaaggggct 4500
cttgatttgt tgaactatcc cgtcctgaga tatttttgtt ttctgctcga ggtaatctga 4560
gaaactgttc tccatccaca cacggacagg gctgcctgag ggcaacgtcc tgctggcctg 4620
ttaacgaaat gctttgcggg atgcagaaaa ctgttgccaa ttgtcaaaac aaaatggtgt 4680
caccctgtct cggtgtccag ctgtcctctg ttagagggga gaaaccgaga aaggacaaac 4740
ggcctgcagc ttgctaacct cagcgtagca ggagcctggg tgagtgctcg gctccctcca 4800
tttccttaga tgcggacttg ttgcccctgt tggcgtttta agagtgccag caagaagcaa 4860
agagggttgg taggtctctg gtatttaact gccggctttg ggatcagatt agaagtgaat 4920
ttcagtctga tttatttctt aatttgggct ttaaatattt tactccggcg tggtggaaaa 4980
agaagccact gtgcgcctcc agcatgatat tttagcgctg aaatggctct ggttttcagc 5040
atgctaagta acaggagatt atttttcttt tgattcttgt atttcatttc tttaaaaaaa 5100
aaaaaggaaa tagatcggga caaactctct aaaatgtacc tggctggctg gggtggggtc 5160
cttaccaatc tgctgcctga aagatacagc ttcagcacag gcctgcgtgt tggactttag 5220
gcatatcatg gattcccacg ccagttggta acctggactg tgctaatgga agttctttct 5280
gcacagaaca tgtaggccag gaggaggcag ggacccggga ggggggtgga ctttgcaggt 5340
catctgctta gcttagtggt ggccacgggt taacacgtat atagtgttac tgtttgaaac 5400
tccaagtttt atatctgtgc tgttttgatg tagaatttgg ggaggttcct gatgatacta 5460
ccctacccgt gtatgtaaga cagtctttca acctgcagtg ccagaatgtg acccacactt 5520
cagtatcttc cataaagtgg ggggactaag aactggacag gggtgctgtg gaggggggca 5580
ggccaggtgt atcttggttc ctgagcagag cagagagctt aggaaggggt cgggagatct 5640
ctggttcctc ccaacactgg tttcattttg catggctctc ttcaaacctc ttgccccagg 5700
agaagcgagc tttgtccaag ccagctggct cgctcctttc ccagatgttt taggggcctc 5760
cctgaaagct tgccctcctc ttaagattca gaactcctga cccagggaaa gataggaggc 5820
tttgtggatg ggagcttttt tttaaagagg accgttctcg ttctcaagta ggtagctaga 5880
gagaagcccc ctggagcagg ccctacttgt gactgtcagg gaacccaggt tgtgttgtag 5940
gcttttccca ggcctcccag agcagcggtg tgaaaaaatg cggtcctggg aaaagttggt 6000
ctggggtgtt gcttcctcga g 6021



<210> 6 

<211> 2712 

<212> DNA 

<213> Mus musculus 
<220>
<221> CDS

<222> (422)..(1906)
<300>

<308> GenBank/Y08222 
<309> 1997-05-14 



<300> 

<301> Miura, N 
<303> Genomics 
<304> 41 
<306> 489-492 
<307> 1997 



<400> 6 
agggactttg cttctttttc cgggctcggc cgcgcagcct ctccggaccc tagctcgctg 60
acgctgcggg ctgcagttct cctggcgggg cccgagagcc gctgtctcct tttctagcac 120
tcggaagggc tggtgtcgct ccacggtcgc gcgtggcgtc tgtgccgcca gctcagggct 180
gccacccgcc aagccgagag tgcgcggcca gcggggccgc ctgccgtgca cccttcagga 240
tgccgatccg cccggtcggc tgaacccgag cgccggcgtc ttccgcgcgt ggaccgcgag 300
gctgccccga gtcggggctg cctgcatcgc tccgtccctt cctgctctcc tgctccgggc 360
ctcgctcgcc gcgggccgca gtcggtgcgc gcaggcggcg accgggcgtc tgggacgcag 420


c atg cag gcg cgt tac tcg gta tcg gac ccc aac gcc ctg gga gtg gta 469
  Met Gln Ala Arg Tyr Ser Val Ser Asp Pro Asn Ala Leu Gly Val Val
  1               5                   10                  15

ccc tat ttg agt gag caa aac tac tac cgg gcg gcc ggc agc tac ggc 517
Pro Tyr Leu Ser Glu Gln Asn Tyr Tyr Arg Ala Ala Gly Ser Tyr Gly
            20                  25                  30

ggc atg gcc agc ccc atg ggc gtc tac tcc ggc cac ccg gag cag tac 565
Gly Met Ala Ser Pro Met Gly Val Tyr Ser Gly His Pro Glu Gln Tyr
        35                  40                  45

ggc gcc ggc atg ggc cgc tcc tac gcg ccc tac cac cac cag ccc gcg 613
Gly Ala Gly Met Gly Arg Ser Tyr Ala Pro Tyr His His Gln Pro Ala
    50                  55                  60

gcg ccc aag gac ctg gtg aag ccg ccc tac agc tat ata gcg ctc atc 661
Ala Pro Lys Asp Leu Val Lys Pro Pro Tyr Ser Tyr Ile Ala Leu Ile
65                  70                  75                  80

acc atg gcg atc cag aac gcg cca gag aag aag atc act ctg aac ggc 709
Thr Met Ala Ile Gln Asn Ala Pro Glu Lys Lys Ile Thr Leu Asn Gly
                85                  90                  95

atc tac cag ttc atc atg gac cgt ttc ccc ttc tac cgc gag aac aag 757
Ile Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn Lys
            100                 105                 110

cag ggc tgg cag aac agc atc cgc cac aac ctg tca ctc aat gag tgc 805
Gln Gly Trp Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu Cys
        115                 120                 125

ttc gtg aaa gtg ccg cgc gac gac aag aag ccg ggc aag ggc agc tac 853
Phe Val Lys Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser Tyr
    130                 135                 140

tgg acg ctc gac ccg gac tcc tac aac atg ttc gag aat ggc agc ttc 901
Trp Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser Phe
145                 150                 155                 160

ctg cgg cgg cgg cgg cgc ttc aag aag aag gat gtg ccc aag gac aag 949
Leu Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp Val Pro Lys Asp Lys
                165                 170                 175

gag gag cgg gcc cac ctc aag gag ccg ccc tcg acc acg gcc aag ggc 997
Glu Glu Arg Ala His Leu Lys Glu Pro Pro Ser Thr Thr Ala Lys Gly
            180                 185                 190

gct ccg aca ggg acc ccg gta gct gac ggg ccc aag gag gcc gag aag 1045
Ala Pro Thr Gly Thr Pro Val Ala Asp Gly Pro Lys Glu Ala Glu Lys
        195                 200                 205

aaa gtc gtg gtt aag agc gag gcg gcg tcc ccc gcg ctg ccg gtc atc 1093
Lys Val Val Val Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val Ile
    210                 215                 220

acc aag gtg gag acg ctg agc ccc gag gga gcg ctg cag gcc agt ccg 1141
Thr Lys Val Glu Thr Leu Ser Pro Glu Gly Ala Leu Gln Ala Ser Pro
225                 230                 235                 240

cgc agc gca tcc tcc acg ccc gca ggt tcc cca gac ggc tcg ctg ccg 1189
Arg Ser Ala Ser Ser Thr Pro Ala Gly Ser Pro Asp Gly Ser Leu Pro
                245                 250                 255

gag cac cac gcc gcg gcg cct aac ggg ctg ccc ggc ttc agc gtg gag 1237
Glu His His Ala Ala Ala Pro Asn Gly Leu Pro Gly Phe Ser Val Glu
            260                 265                 270

acc atc atg acg ctg cgc acg tcg cct ccg ggc ggc gat ctg agc cca 1285
Thr Ile Met Thr Leu Arg Thr Ser Pro Pro Gly Gly Asp Leu Ser Pro
        275                 280                 285

gcg gcc gcg cgc gcc ggc ctg gtg gtg cca ccg ctg gca ctg cca tac 1333
Ala Ala Ala Arg Ala Gly Leu Val Val Pro Pro Leu Ala Leu Pro Tyr
    290                 295                 300

gcc gca gcg cca ccc gcc gct tac acg cag ccg tgc gcg cag ggc ctg 1381
Ala Ala Ala Pro Pro Ala Ala Tyr Thr Gln Pro Cys Ala Gln Gly Leu
305                 310                 315                 320

gag gct gcg ggc tcc gcg ggc tac cag tgc agt atg cgg gct atg agt 1429
Glu Ala Ala Gly Ser Ala Gly Tyr Gln Cys Ser Met Arg Ala Met Ser
                325                 330                 335

ctg tac acc ggg gcc gag cgg ccc gcg cac gtg tgc gtt ccg ccc gcg 1477
Leu Tyr Thr Gly Ala Glu Arg Pro Ala His Val Cys Val Pro Pro Ala
            340                 345                 350

ctg gac gag gct ctg tcg gac cac ccg agc ggc ccc ggc tcc ccg ctc 1525
Leu Asp Glu Ala Leu Ser Asp His Pro Ser Gly Pro Gly Ser Pro Leu
        355                 360                 365

ggc gcc ctc aac ctc gca gcg ggt cag gag ggc gcg ttg ggg gcc tcg 1573
Gly Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Gly Ala Ser
    370                 375                 380

ggt cac cac cac cag cat cac ggc cac ctc cac ccg cag gcg cca ccg 1621
Gly His His His Gln His His Gly His Leu His Pro Gln Ala Pro Pro
385                 390                 395                 400

ccc gcc ccg cag ccc cct ccc gcg ccg cag ccc gcc acc cag gcc acc 1669
Pro Ala Pro Gln Pro Pro Pro Ala Pro Gln Pro Ala Thr Gln Ala Thr
                405                 410                 415

tcc tgg tat ctg aac cac ggc ggg gac ctg agc cac ctc ccc ggc cac 1717
Ser Trp Tyr Leu Asn His Gly Gly Asp Leu Ser His Leu Pro Gly His
            420                 425                 430

acg ttt gca acc caa cag caa act ttc ccc aac gtc cgg gag atg ttc 1765
Thr Phe Ala Thr Gln Gln Gln Thr Phe Pro Asn Val Arg Glu Met Phe
        435                 440                 445

aac tcg cac cgg cta gga ctg gac aac tcg tcc ctc ggg gag tcc cag 1813
Asn Ser His Arg Leu Gly Leu Asp Asn Ser Ser Leu Gly Glu Ser Gln
    450                 455                 460

gtg agc aat gcg agc tgt cag ctg ccc tat cga gct acg ccg tcc ctc 1861
Val Ser Asn Ala Ser Cys Gln Leu Pro Tyr Arg Ala Thr Pro Ser Leu
465                 470                 475                 480

tac cgc cac gca gcc ccc tac tct tac gac tgc acc aaa tac tga 1906
Tyr Arg His Ala Ala Pro Tyr Ser Tyr Asp Cys Thr Lys Tyr
                485                 490                 495




ggctgtccag tccgctccag ccccaggacc gcaccggctt cgcctcctcc atgggaacct 1966
tcttcgacgg agccgcagaa agcgacggaa agcgcccctc tctcagaacc aggagcagag 2026
agctccgtgc aactcgcagg taacttatcc gcagctcagt ttgagatctc agcgagtccc 2086
tctaaggggg atgcagccca gcaaaacgaa atacagattt tttttttaat tccttcccct 2146
acccagatgc tgcgcctgct cccttggggc ttcatagatt agcttatgga ccaaacccat 2206
agggacccct aatgacttct gtggagattc tccacgggcg caagaggtct ctccggataa 2266
ggtgccttct gtaaacgagt gcggatttgt aaccaggcta ttttgttctt gcccagagcc 2326
tttaatataa tatttaaagt tgtgtccact ggataaggtt tcgtcttgcc caactgttac 2386
tgccaaattg aattcaagaa acgtgtgtgg gtcttttctc cccacgtcac catgataaaa 2446
taggtccctc cccaaactgt aggtctttta caaaacaaga aaataattta tttttttgtt 2506
gttgttggat aacgaaatta agtatcggat acttttaatt taggaagtgc atggctttgt 2566
acagtagatg ccatctgggg tattccaaaa acacaccaaa agactttaaa atttcaatct 2626
cacctgtgtt tgtcttatgt gatctcagtg ttgtatttac cttaaaataa acccgtgttg 2686
tttttctgcc caaaaaaaaa aaaaaa 2712
Lys
Ser
160


Leu 


Arg 


Arg 


Arg 


Arg 


Arg 
Lys


Lys 


Asp 


Val 
Lys


Asp 


Lys
175




Glu 


Glu 
His
Glu
Pro
Thr
Ala


Lys 


Gly 
Pro
Glu
Glu
Ser
Thr


Lys 


Val 


Glu 


Thr 


Leu 


Ser 


Pro 


Glu 


Gly 


Ala 


Leu 


Gln


Ala 


Ser 


Pro 
Arg


Ser 


Ala 


Ser 


Ser 


Thr 


Pro 


Ala 


Gly 


Ser 


Pro 


Asp 


Gly 


Ser 


Leu 


Pro 










245 










250 










255 
His


His 
Ala


Ala 


Pro 
Gly


Leu 


Pro 


Gly 


Phe 


Ser 


Val 


Glu 
Thr
Pro
Gly


Asp 


Leu 
Pro
Val
Pro


Leu 
Leu


Pro 
Leu
320


Glu 


Ala 


Ala 


Gly 


Ser 
360
Gly Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Gly Ala Ser
    370                 375                 380

Gly His His His Gln His His Gly His Leu His Pro Gln Ala Pro Pro
385                 390                 395                 400

Pro Ala Pro Gln Pro Pro Pro Ala Pro Gln Pro Ala Thr Gln Ala Thr
                405                 410                 415

Ser Trp Tyr Leu Asn His Gly Gly Asp Leu Ser His Leu Pro Gly His
            420                 425                 430

Thr Phe Ala Thr Gln Gln Gln Thr Phe Pro Asn Val Arg Glu Met Phe
        435                 440                 445

Asn Ser His Arg Leu Gly Leu Asp Asn Ser Ser Leu Gly Glu Ser Gln
    450                 455                 460

Val Ser Asn Ala Ser Cys Gln Leu Pro Tyr Arg Ala Thr Pro Ser Leu
465                 470                 475                 480

Tyr Arg His Ala Ala Pro Tyr Ser Tyr Asp Cys Thr Lys Tyr
                485                 490



<210> 8 

<211> 3289 

<212> DNA 

<213> Homo sapiens 

<300> 

<301> Miura, N 
<303> Genomics 
<304> 41 
<306> 489-492 
<307> 1997 



<300> 

<308> GenBank/Y08223 
<309> 1997-05-14 

<400> 8 

gaattcggag gattaagttg tcagtcagca cgttgctacc ttcccctcta tgcactccgc 60 
tgcctggctc ctcggcgggg agcgagggaa actcagtttg tagggtttac ctctaaaacc 120 
tcgataggtt atccttgacg accccgagcc tggaaactcc ctgttgatga ttaattattt 180 
gattaaataa gtataacatc caggagaggc cctgccattc caatccagcg cgtttgcttt 240 
tgaatccatt acacctgggc ccccataatt aggaaatcta attattcgct tcatcactca 300 
ttaataagaa aaatgtccca ggatcattgc tacttacaag gtctttggga gagatatttt 360 
actctattaa tccattctat tttatatttc aaattgattt tttttaacag aggaaagtgg 420 
ctatcttttt gttttgggca tgtgggccca ttcaccaaaa tgtgatcata aaataaattt 480 
taataagata taacttttta aaaagttttc aagtgaagac ggagtcgccg cggaggccgg 540 
ggcggcgggg tcttagagcc gacggattcc tgcgctcctc gccccgattg gcgccggact 600 
cctctcagct gccgggtgat tggctcaaag ttccgggagg gggcgtggcc cgaggaaagt 660 
aaaaactcgc tttcagcaag aagacttttg aaacttttcc caatccctaa aagggacttg 720 
gcctcttttt ctgggctcag cggggcagcc gctcggaccc cggcgcgctg accctcgggg 780 
ctgccgattc gctgggggct tggagagcct cctgcgcccc tcctcgcgcg ggccgagggt 840 
ccaccttggt ccccaggccg cggcgtctcc gctgggtccg cggccgcccg cctgcccgcg 900 
ctgccgccgc cgggtcctgg agccagcgag gagcggggcc ggcgctgcgc ttgcccgggg 960 
cgcgccctcc aggatgccga tccgcccggt ccgctgaaag cgcgcgcccc tgctcggccc 1020 
gagcgacgac gaccgcgcac cctcgccccg gaggctgcca ggagaccggg gccgcccctc 1080 
ccgctcccct cctctccccc tctggctctc tcgcgctctc tcgctctcag ggcccccctc 1140 
gctcccccgg ccgcagtccg tgcgcgaggg cgccggcgag ccgtctcgga agcagcatgc 1200 
aggcgcgcta ctccgtgtcc gaccccaacg ccctgggagt ggtgccctac ctgagcgagc 1260 
agaattacta ccgggctgcg ggcagctacg gcggcatggc cagccccatg ggcgtctatt 1320 
ccggccaccc ggagcagtac agcgcgggga tgggccgctc ctacgcgccc taccaccacc 1380 
accagcccgc ggcgcctaag gacctggtga agccgcccta cagctacatc gcgctcatca 1440 
ccatggccat ccagaacgcg cccgagaaga agatcacctt gaacggcatc taccagttca 1500 
tcatggaccg cttccccttc taccgggaga acaagcaggg ctggcagaac agcatccgcc 1560 
acaacctctc gctcaacgag tgcttcgtca aggtgccccg cgacgacaag aagcccggca 1620 
agggcagtta ctggaccctg gacccggact cctacaacat gttcgagaac ggcagcttcc 1680 
tgcggcgccg gcggcgcttc aaaaagaagg acgtgtccaa ggagaaggag gagcgggccc 1740 
acctcaagga gccgcccccg gcggcgtcca agggcgcccc ggccaccccc cacctagcgg 1800 
acgcccccaa ggaggccgag aagaaggtgg tgatcaagag cgaggcggcg tccccggcgc 1860 
tgccggtcat caccaaggtg gagacgctga gccccgagag cgcgctgcag ggcagcccgc 1920 
gcagcgcggc ctccacgccc gccggctccc ccgacggttc gctgccggag caccacgccg 1980 
cggcgcccaa cgggctgcct ggcttcagcg tggagaacat catgaccctg cgaacgtcgc 2040 
cgccgggcgg agagctgagc ccgggggccg gacgcgcggg cctggtggtg ccgccgctgg 2100 
cgctgccata cgccgccgcg ccgcccgccg cctacggcca gccgtgcgct cagggcctgg 2160 
aggccggggc cgccgggggc taccagtgca gcatgcgagc gatgagcctg tacaccgggg 2220 
ccgagcggcc ggcgcacatg tgcgtcccgc ccgccctgga cgaggccctc tcggaccacc 2280 
cgagcggccc cacgtcgccc ctgagcgctc tcaacctcgc cgccggccag gagggcgcgc 2340 
tcgccgccac gggccaccac caccagcacc acggccacca ccacccgcag gcgccgccgc 2400 
ccccgccggc tccccagccc cagccgacgc cgcagcccgg ggccgccgcg gcgcaggcgg 2460 
cctcctggta tctcaaccac agcggggacc tgaaccacct ccccggccac acgttcgcgg 2520 
cccagcagca aactttcccc aacgtgcggg agatgttcaa ctcccaccgg ctggggattg 2580 
agaactcgac cctcggggag tcccaggtga gtggcaatgc cagctgccag ctgccctaca 2640 
gatccacgcc gcctctctat cgccacgcag ccccctactc ctacgactgc acgaaatact 2700 
gacgtgtccc gggacctccc ctccccggcc cgctccggct tcgcttccca gccccgaccc 2760 
aaccagacaa ttaaggggct gcagagacgc aaaaaagaaa caaaacatgt ccaccaacct 2820 
tttctcagac ccgggagcag agagcgggca cgctagcccc cagccgtctg tgaagagcgc 2880 
aggtaacttt aattcgccgc cccgtttctg cjgatcccagg aaacccctcc aaagggacgc 2940 
agcccaacaa aatgagtatt ggtcttaaaa tccccQtccc ctaccaggac ggctgtgctg 3000 
tgctcgacct gagctttcaa aagttaagtt atggacccaa atcccatagc gagcccctag 3060 
tgactttctg taggggtccc cataggtgta tgggggtctc tatagataat atatgtgctg 3120 
tgtgtaattt taaatttctc caaccgtgct gtacaaatgt gtggatttgt aatcaggcta 3180 
ttttgttgtt gttgttgttg ttcagagcca ttaatataat atttaaagtt gagttcactg 3240 
gataagtttt tcatcttgcc caaccatttc taactgccaa attgaattc 3289 

<210> 9 

<211> 1506 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1) . . (1506) 

<300> 

<308> GenBank/NM_005251 
<309> 1999-12-23 

<400> 9 

atg cag gcg cgc tac tcc gtg tcc gac ccc aac gcc ctg gga gtg gtg 48 
Met Gln Ala Arg Tyr Ser Val Ser Asp Pro Asn Ala Leu Gly Val Val 
1 5 10 15 

ccc tac ctg agc gag cag aat tac tac cgg gct gcg ggc agc tac ggc 96 
Pro Tyr Leu Ser Glu Gln Asn Tyr Tyr Arg Ala Ala Gly Ser Tyr Gly 
20 25 30 

ggc atg gcc agc ccc atg ggc gtc tat tcc ggc cac ccg gag cag tac 144 
Gly Met Ala Ser Pro Met Gly Val Tyr Ser Gly His Pro Glu Gln Tyr 
35 40 45 

agc gcg ggg atg ggc cgc tcc tac gcg ccc tac cac cac cac cag ccc 192 
Ser Ala Gly Met Gly Arg Ser Tyr Ala Pro Tyr His His His Gln Pro 
50 55 60 

gcg gcg cct aag gac ctg gtg aag ccg ccc tac agc tac atc gcg ctc 240 
Ala Ala Pro Lys Asp Leu Val Lys Pro Pro Tyr Ser Tyr Ile Ala Leu 
65 70 75 80 

atc acc atg gcc atc cag aac gcg ccc gag aag aag atc acc ttg aac 288 
Ile Thr Met Ala Ile Gln Asn Ala Pro Glu Lys Lys Ile Thr Leu Asn 
85 90 95 

ggc atc tac cag ttc atc atg gac cgc ttc ccc ttc tac cgg gag aac 336 
Gly Ile Tyr Gln Phe Ile Met Asp Arg Phe Pro Phe Tyr Arg Glu Asn 
100 105 110 

aag cag ggc tgg cag aac agc atc cgc cac aac ctc tcg ctc aac gag 384 
Lys Gln Gly Trp Gln Asn Ser Ile Arg His Asn Leu Ser Leu Asn Glu 
115 120 125 

tgc ttc gtc aag gtg ccc cgc gac gac aag aag ccc ggc aag ggc agt 432 
Cys Phe Val Lys Val Pro Arg Asp Asp Lys Lys Pro Gly Lys Gly Ser 
130 135 140 

tac tgg acc ctg gac ccg gac tcc tac aac atg ttc gag aac ggc agc 480 
Tyr Trp Thr Leu Asp Pro Asp Ser Tyr Asn Met Phe Glu Asn Gly Ser 
145 150 155 160 

ttc ctg cgg cgc cgg cgg cgc ttc aaa aag aag gac gtg tcc aag gag 528 
Phe Leu Arg Arg Arg Arg Arg Phe Lys Lys Lys Asp Val Ser Lys Glu 
165 170 175 

aag gag gag cgg gcc cac ctc aag gag ccg ccc ccg gcg gcg tcc aag 576 
Lys Glu Glu Arg Ala His Leu Lys Glu Pro Pro Pro Ala Ala Ser Lys 
180 185 190 

ggc gcc ccg gcc acc ccc cac cta gcg gac gcc ccc aag gag gcc gag 624 
Gly Ala Pro Ala Thr Pro His Leu Ala Asp Ala Pro Lys Glu Ala Glu 
195 200 205 

aag aag gtg gtg atc aag agc gag gcg gcg tcc ccg gcg ctg ccg gtc 672 
Lys Lys Val Val Ile Lys Ser Glu Ala Ala Ser Pro Ala Leu Pro Val 
210 215 220 

atc acc aag gtg gag acg ctg agc ccc gag agc gcg ctg cag ggc agc 720 
Ile Thr Lys Val Glu Thr Leu Ser Pro Glu Ser Ala Leu Gln Gly Ser 
225 230 235 240 

ccg cgc agc gcg gcc tcc acg ccc gcc ggc tcc ccc gac ggt tcg ctg 768 
Pro Arg Ser Ala Ala Ser Thr Pro Ala Gly Ser Pro Asp Gly Ser Leu 
245 250 255 

ccg gag cac cac gcc gcg gcg ccc aac ggg ctg cct ggc ttc agc gtg 816 
Pro Glu His His Ala Ala Ala Pro Asn Gly Leu Pro Gly Phe Ser Val 
260 265 270 

gag aac atc atg acc ctg cga acg tcg ccg ccg ggc gga gag ctg agc 864 
Glu Asn Ile Met Thr Leu Arg Thr Ser Pro Pro Gly Gly Glu Leu Ser 
275 280 285 

ccg ggg gcc gga cgc gcg ggc ctg gtg gtg ccg ccg ctg gcg ctg cca 912 
Pro Gly Ala Gly Arg Ala Gly Leu Val Val Pro Pro Leu Ala Leu Pro 
290 295 300 

tac gcc gcc gcg ccg ccc gcc gcc tac ggc cag ccg tgc gct cag ggc 960 
Tyr Ala Ala Ala Pro Pro Ala Ala Tyr Gly Gln Pro Cys Ala Gln Gly 
305 310 315 320 

ctg gag gcc ggg gcc gcc ggg ggc tac cag tgc agc atg cga gcg atg 1008 
Leu Glu Ala Gly Ala Ala Gly Gly Tyr Gln Cys Ser Met Arg Ala Met 
325 330 335 

agc ctg tac acc ggg gcc gag cgg ccg gcg cac atg tgc gtc ccg ccc 1056 
Ser Leu Tyr Thr Gly Ala Glu Arg Pro Ala His Met Cys Val Pro Pro 
340 345 350 

gcc ctg gac gag gcc ctc tcg gac cac ccg agc ggc ccc acg tcg ccc 1104 
Ala Leu Asp Glu Ala Leu Ser Asp His Pro Ser Gly Pro Thr Ser Pro 
355 360 365 

ctg agc gct ctc aac ctc gcc gcc ggc cag gag ggc gcg ctc gcc gcc 1152 
Leu Ser Ala Leu Asn Leu Ala Ala Gly Gln Glu Gly Ala Leu Ala Ala 
370 375 380 

acg ggc cac cac cac cag cac cac ggc cac cac cac ccg cag gcg ccg 1200 
Thr Gly His His His Gln His His Gly His His His Pro Gln Ala Pro 
385 390 395 400 

ccg ccc ccg ccg gct ccc cag ccc cag ccg acg ccg cag ccc ggg gcc 1248 
Pro Pro Pro Pro Ala Pro Gln Pro Gln Pro Thr Pro Gln Pro Gly Ala 
405 410 415 

gcc gcg gcg cag gcg gcc tcc tgg tat ctc aac cac agc ggg gac ctg 1296 
Ala Ala Ala Gln Ala Ala Ser Trp Tyr Leu Asn His Ser Gly Asp Leu 
420 425 430 

aac cac ctc ccc ggc cac acg ttc gcg gcc cag cag caa act ttc ccc 1344 
Asn His Leu Pro Gly His Thr Phe Ala Ala Gln Gln Gln Thr Phe Pro 
435 440 445 

aac gtg cgg gag atg ttc aac tcc cac cgg ctg ggg att gag aac tcg 1392 
Asn Val Arg Glu Met Phe Asn Ser His Arg Leu Gly Ile Glu Asn Ser 
450 455 460 

acc ctc ggg gag tcc cag gtg agt ggc aat gcc agc tgc cag ctg ccc 1440 
Thr Leu Gly Glu Ser Gln Val Ser Gly Asn Ala Ser Cys Gln Leu Pro 
465 470 475 480 

tac aga tcc acg ccg cct ctc tat cgc cac gca gcc ccc tac tcc tac 1488 
Tyr Arg Ser Thr Pro Pro Leu Tyr Arg His Ala Ala Pro Tyr Ser Tyr 
485 490 495 

gac tgc acg aaa tac tga 1506 
Asp Cys Thr Lys Tyr 
500 

Ser Ser His His His Pro 
60 

Ala Lys Leu Val Lys Pro Tyr Ile Leu 
65 70 75 80 

Ile Thr Met Ala Ile Gln Asn Ala Pro Glu Lys Lys Ile Thr Leu 
85 90 95 

Gly Ile Phe 

Pro Pro Pro Pro Ala Pro Gln Pro Gln Pro Thr Pro Gln Pro Gly Ala 
405 410 415 

Ala Ala Ala Gln Ala Ala Ser Trp Tyr Leu Asn His Ser Gly Asp Leu 
420 425 430 

Asn His Leu Pro Gly His Thr Phe Ala Ala Gln Gln Gln Thr Phe Pro 
435 440 445 

Asn Val Arg Glu Met Phe Asn Ser His Arg Leu Gly Ile Glu Asn Ser 
450 455 460 

Thr Leu Gly Glu Ser Gln Val Ser Gly Asn Ala Ser Cys Gln Leu Pro 
465 470 475 480 

Tyr Arg Ser Thr Pro Pro Leu Tyr Arg His Ala Ala Pro Tyr Ser Tyr 
485 490 495 

Asp Cys Thr Lys Tyr 